SET THEORY


1. Sets and Functions


1.1. Basic definitions. Mathematics habitually deals with “sets” made up
of “elements” of various kinds, e.g., the set of faces of a polyhedron, the
set of points on a line, the set of all positive integers, and so on. Because of
their generality, it is hard to define these concepts in a way that does more
than merely replace the word “set” by some equivalent term like “class,”
“family,” “collection,” etc. and the word “element” by some equivalent
term like “member.” We will adopt a “naive” point of view and regard the
notions of a set and the elements of a set as primitive and well-understood.

The set concept plays a key role in modern mathematics. This is partly 
due to the fact that set theory, originally developed towards the end of the 
nineteenth century, has by now become an extensive subject in its own right. 
More important, however, is the great influence which set theory has exerted 
and continues to exert on mathematical thought as a whole. In this chapter, 
we introduce the basic set-theoretic notions and notation to be used in the 
rest of the book. 

Sets will be denoted by capital letters like A, B, ..., and elements of
sets by small letters like a, b, .... The set with elements a, b, c, ... is often
denoted by {a, b, c, ...}, i.e., by writing the elements of the set between
curly brackets. For example, {1} is the set whose only member is 1, while
{1, 2, ..., n, ...} is the set of all positive integers. The statement “the
element a belongs to the set A” is written symbolically as a ∈ A, while
a ∉ A means that “the element a does not belong to the set A.” If every
element of a set A also belongs to a set B, we say that A is a subset of the
set B and write A ⊂ B or B ⊃ A (also read as “A is contained in B” or
“B contains A”). For example, the set of all even numbers is a subset of the
set of all real numbers. We say that two sets A and B are equal and write
A = B if A and B consist of precisely the same elements. Note that A = B
if and only if A ⊂ B and B ⊂ A, i.e., if and only if every element of A is an
element of B and every element of B is an element of A. If A ⊂ B but A ≠ B,
we call A a proper subset of B.
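
The relations just defined can be tried out on finite sets. The following is a minimal sketch using Python's built-in set type; the sample sets `integers` and `evens` and the helper names are illustrative choices, not part of the text.

```python
# Finite stand-ins for the sets discussed above (illustrative choices).
integers = set(range(-10, 11))
evens = {n for n in integers if n % 2 == 0}

def is_subset(a, b):
    """A is a subset of B: every element of A also belongs to B."""
    return all(x in b for x in a)

def is_equal(a, b):
    """A = B if and only if A is a subset of B and B is a subset of A."""
    return is_subset(a, b) and is_subset(b, a)

def is_proper_subset(a, b):
    """A is a subset of B, but A differs from B."""
    return is_subset(a, b) and not is_equal(a, b)

print(is_subset(evens, integers))         # True
print(is_proper_subset(evens, integers))  # True
print(is_equal({1, 2}, {2, 1}))           # True: the order of listing is irrelevant
```

Equality is deliberately implemented through the two inclusions, mirroring the remark that A = B if and only if each set is contained in the other.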

Sometimes it is not known in advance whether or not a certain set (for 
example, the set of roots of a given equation) contains any elements at all. 
Thus it is convenient to introduce the concept of the empty set, i.e., the set 
containing no elements at all. This set will be denoted by the symbol ∅.
The set ∅ is clearly a subset of every set (why?).

1.2. Operations on sets. Let A and B be any two sets. Then by the sum
or union of A and B, denoted by A ∪ B, is meant the set consisting of all
elements which belong to at least one of the sets A and B (see Figure 1).
More generally, by the sum or union of an arbitrary number (finite or
infinite) of sets A_α (indexed by some parameter α), we mean the set, denoted by

    ⋃_α A_α,

of all elements belonging to at least one of the sets A_α.

By the intersection A ∩ B of two given sets A and B, we mean the set
consisting of all elements which belong to both A and B (see Figure 2). For
example, the intersection of the set of all even numbers and the set of all
integers divisible by 3 is the set of all integers divisible by 6. By the
intersection of an arbitrary number (finite or infinite) of sets A_α, we mean
the set, denoted by

    ⋂_α A_α,

of all elements belonging to every one of the sets A_α. Two sets A and B are
said to be disjoint if A ∩ B = ∅, i.e., if they have no elements in common.
More generally, let ℱ be a family of sets such that A ∩ B = ∅ for every
pair of sets A, B in ℱ. Then the sets in ℱ are said to be pairwise disjoint.


It is an immediate consequence of the above definitions that the operations
∪ and ∩ are commutative and associative, i.e., that

    A ∪ B = B ∪ A,    (A ∪ B) ∪ C = A ∪ (B ∪ C),
    A ∩ B = B ∩ A,    (A ∩ B) ∩ C = A ∩ (B ∩ C).

Moreover, the operations ∪ and ∩ obey the following distributive laws:

    (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C),    (1)
    (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).    (2)

For example, suppose x ∈ (A ∪ B) ∩ C, so that x belongs to the left-hand
side of (1). Then x belongs to both C and A ∪ B, i.e., x belongs to both
C and at least one of the sets A and B. But then x belongs to at least one of
the sets A ∩ C and B ∩ C, i.e., x ∈ (A ∩ C) ∪ (B ∩ C), so that x belongs
to the right-hand side of (1). Conversely, suppose x ∈ (A ∩ C) ∪ (B ∩ C).
Then x belongs to at least one of the two sets A ∩ C and B ∩ C. It follows
that x belongs to both C and at least one of the two sets A and B, i.e., x ∈ C
and x ∈ A ∪ B or equivalently x ∈ (A ∪ B) ∩ C. This proves (1), and (2) is
proved similarly.
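
As a sanity check (not a substitute for the proof), the distributive laws (1) and (2) can be verified by brute force over all subsets of a small set. The sketch below assumes a three-element universe; the function names are hypothetical.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def distributive_laws_hold(universe):
    """Check laws (1) and (2) for every triple A, B, C of subsets."""
    all_sets = subsets(universe)
    for a in all_sets:
        for b in all_sets:
            for c in all_sets:
                if (a | b) & c != (a & c) | (b & c):   # law (1)
                    return False
                if (a & b) | c != (a | c) & (b | c):   # law (2)
                    return False
    return True

print(distributive_laws_hold({1, 2, 3}))  # True
```

Here `|` and `&` are Python's union and intersection operators; a three-element universe already yields 8 subsets and 512 triples.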

By the difference A − B between two sets A and B (in that order), we
mean the set of all elements of A which do not belong to B (see Figure 3).
Note that it is not assumed that A ⊃ B. It is sometimes convenient (e.g., in
measure theory) to consider the symmetric difference of two sets A and B,
denoted by A △ B and defined as the union of the two differences A − B
and B − A (see Figure 4):


    A △ B = (A − B) ∪ (B − A).


We will often be concerned later with various sets which are all subsets
of some underlying basic set R, for example, various sets of points on the
real line. In this case, given a set A, the difference R − A is called the
complement of A, denoted by CA.

An important role is played in set theory and its applications by the
following “duality principle”:


    R − ⋃_α A_α = ⋂_α (R − A_α),    (3)
    R − ⋂_α A_α = ⋃_α (R − A_α).    (4)


In words, the complement of a union equals the intersection of the comple- 
ments, and the complement of an intersection equals the union of the 
complements. According to the duality principle, any theorem involving a 
family of subsets of a fixed set R can be converted automatically into another, 
“dual” theorem by replacing all subsets by their complements, all unions 
by intersections and all intersections by unions. To prove (3), suppose 


    x ∈ R − ⋃_α A_α.    (5)

Then x does not belong to the union

    ⋃_α A_α,    (6)

i.e., x does not belong to any of the sets A_α. It follows that x belongs to each
of the complements R − A_α, and hence

    x ∈ ⋂_α (R − A_α).    (7)

Conversely, suppose (7) holds, so that x belongs to every set R − A_α. Then
x does not belong to any of the sets A_α, i.e., x does not belong to the union
(6), or equivalently (5) holds. This proves (3), and (4) is proved similarly 
(give the details). 
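
The duality principle can be checked numerically on a finite family of sets. The following sketch assumes an illustrative basic set R = {0, ..., 11} and three subsets of it; none of these particular sets come from the text.

```python
from functools import reduce

R = set(range(12))                                   # underlying basic set (illustrative)
family = [{0, 1, 2}, {2, 3, 4, 5}, {2, 5, 8, 11}]    # the sets of the family

def union(sets):
    """Union of a finite family of sets."""
    return reduce(lambda s, t: s | t, sets, set())

def intersection(sets, universe):
    """Intersection of a finite family of subsets of `universe`."""
    return reduce(lambda s, t: s & t, sets, set(universe))

complements = [R - a for a in family]

# Formula (3): complement of the union = intersection of the complements.
lhs3, rhs3 = R - union(family), intersection(complements, R)
# Formula (4): complement of the intersection = union of the complements.
lhs4, rhs4 = R - intersection(family, R), union(complements)

print(lhs3 == rhs3, lhs4 == rhs4)                    # True True
```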


Remark. The designation “symmetric difference” for the set A △ B is
not too apt, since A △ B has much in common with the sum A ∪ B. In fact,
in A ∪ B the two statements “x belongs to A” and “x belongs to B” are
joined by the conjunction “or” used in the “either ... or ... or both ...”
sense, while in A △ B the same two statements are joined by “or” used in the
ordinary “either ... or ...” sense (as in “to be or not to be”). In other words,
x belongs to A ∪ B if and only if x belongs to either A or B or both, while x
belongs to A △ B if and only if x belongs to either A or B but not both. The
set A △ B can be regarded as a kind of “modulo-two sum” of the sets A and
B, i.e., a sum of the sets A and B in which elements are dropped if they are
counted twice (once in A and once in B).
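
The “modulo-two sum” reading can be made concrete by counting, for each element, how many of the two sets contain it. A small sketch with two illustrative sets follows; Python's `^` operator on sets is its built-in symmetric difference.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

# Definition: union of the two differences.
sym_diff = (A - B) | (B - A)

# "Modulo-two sum": add the membership indicators of A and B modulo 2.
universe = A | B
mod_two_sum = {x for x in universe if ((x in A) + (x in B)) % 2 == 1}

print(sorted(sym_diff))                   # [1, 2, 5]
print(sym_diff == mod_two_sum == A ^ B)   # True
```

Elements 3 and 4 are counted twice (once in A and once in B) and therefore dropped.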


1.3. Functions and mappings. Images and preimages. A rule associating a
unique real number y = f(x) with each element of a set of real numbers X
is said to define a (real) function f on X. The set X is called the domain
(of definition) of f, and the set Y of all numbers f(x) such that x ∈ X is called
the range of f.

More generally, let M and N be two arbitrary sets. Then a rule associating
a unique element b = f(a) ∈ N with each element a ∈ M is again said to define
a function f on M (or a function f with domain M). In this more general
context, f is usually called a mapping of M into N. By the same token, f is
said to map M into N (and a into b).

If a is an element of M, the corresponding element b = f(a) is called the
image of a (under the mapping f). Every element of M with a given element
b ∈ N as its image is called a preimage of b. Note that in general b may have
several preimages. Moreover, N may contain elements with no preimages
at all. If b has a unique preimage, we denote this preimage by f^{-1}(b).

If A is a subset of M, the set of all elements f(a) ∈ N such that a ∈ A
is called the image of A, denoted by f(A). The set of all elements of M whose
images belong to a given set B ⊂ N is called the preimage of B, denoted
by f^{-1}(B). If no element of B has a preimage, then f^{-1}(B) = ∅. A function
f is said to map M into N if f(M) ⊂ N, as is always the case, and onto N
if f(M) = N.¹ Thus every “onto mapping” is an “into mapping,” but not
conversely.

Suppose f maps M onto N. Then f is said to be one-to-one if each element
b ∈ N has a unique preimage f^{-1}(b). In this case, f is said to establish a
one-to-one correspondence between M and N, and the mapping f^{-1}
associating f^{-1}(b) with each b ∈ N is called the inverse of f.


THEOREM 1. The preimage of the union of two sets is the union of the
preimages of the sets:

    f^{-1}(A ∪ B) = f^{-1}(A) ∪ f^{-1}(B).

Proof. If x ∈ f^{-1}(A ∪ B), then f(x) ∈ A ∪ B, so that f(x) belongs
to at least one of the sets A and B. But then x belongs to at least one of
the sets f^{-1}(A) and f^{-1}(B), i.e., x ∈ f^{-1}(A) ∪ f^{-1}(B).

Conversely, if x ∈ f^{-1}(A) ∪ f^{-1}(B), then x belongs to at least one
of the sets f^{-1}(A) and f^{-1}(B). Therefore f(x) belongs to at least one of
the sets A and B, i.e., f(x) ∈ A ∪ B. But then x ∈ f^{-1}(A ∪ B). ∎²


THEOREM 2. The preimage of the intersection of two sets is the
intersection of the preimages of the sets:

    f^{-1}(A ∩ B) = f^{-1}(A) ∩ f^{-1}(B).

Proof. If x ∈ f^{-1}(A ∩ B), then f(x) ∈ A ∩ B, so that f(x) ∈ A and
f(x) ∈ B. But then x ∈ f^{-1}(A) and x ∈ f^{-1}(B), i.e., x ∈ f^{-1}(A) ∩ f^{-1}(B).

Conversely, if x ∈ f^{-1}(A) ∩ f^{-1}(B), then x ∈ f^{-1}(A) and x ∈ f^{-1}(B).
Therefore f(x) ∈ A and f(x) ∈ B, i.e., f(x) ∈ A ∩ B. But then
x ∈ f^{-1}(A ∩ B). ∎


¹ As in the case of real functions, the set f(M) is called the range of f.
² The symbol ∎ stands for Q.E.D. and indicates the end of a proof.

THEOREM 3. The image of the union of two sets equals the union of the
images of the sets:


    f(A ∪ B) = f(A) ∪ f(B).


Proof. If y ∈ f(A ∪ B), then y = f(x) where x belongs to at least one
of the sets A and B. Therefore y = f(x) belongs to at least one of the sets
f(A) and f(B), i.e., y ∈ f(A) ∪ f(B).

Conversely, if y ∈ f(A) ∪ f(B), then y = f(x) where x belongs to at
least one of the sets A and B, i.e., x ∈ A ∪ B and hence y = f(x) ∈
f(A ∪ B). ∎


Remark 1. Surprisingly enough, the image of the intersection of two sets
does not necessarily equal the intersection of the images of the sets. For
example, suppose the mapping f projects the xy-plane onto the x-axis,
carrying the point (x, y) into the point (x, 0). Then the segments 0 ≤ x ≤ 1,
y = 0 and 0 ≤ x ≤ 1, y = 1 do not intersect, although their images coincide.
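
Images and preimages are easy to compute for mappings of finite sets, which gives a quick way to see both the preimage theorems and the counterexample of the remark. The sketch below replaces the plane by a small grid of integer points, which is an assumption made purely for illustration; the helper names are hypothetical.

```python
def image(f, A):
    """f(A): the set of all f(a) with a in A."""
    return {f(a) for a in A}

def preimage(f, M, B):
    """Preimage of B: all elements of the finite domain M whose image lies in B."""
    return {a for a in M if f(a) in B}

def project(p):
    """Projection onto the x-axis, carrying (x, y) into (x, 0)."""
    return (p[0], 0)

M = {(x, y) for x in range(3) for y in range(2)}   # finite grid (illustrative)
lower = {(x, 0) for x in range(3)}                 # stand-in for the segment y = 0
upper = {(x, 1) for x in range(3)}                 # stand-in for the segment y = 1

# The two "segments" are disjoint, yet their images coincide.
print(image(project, lower & upper))                                # set()
print(image(project, lower) & image(project, upper) == lower)       # True

# Preimages of unions and intersections, for two subsets of the x-axis.
B1, B2 = {(0, 0), (1, 0)}, {(1, 0), (2, 0)}
print(preimage(project, M, B1 | B2)
      == preimage(project, M, B1) | preimage(project, M, B2))       # True
print(preimage(project, M, B1 & B2)
      == preimage(project, M, B1) & preimage(project, M, B2))       # True
```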


Remark 2. Theorems 1-3 continue to hold for unions and intersections
of an arbitrary number (finite or infinite) of sets A_α:


    f^{-1}(⋃_α A_α) = ⋃_α f^{-1}(A_α),
    f^{-1}(⋂_α A_α) = ⋂_α f^{-1}(A_α),
    f(⋃_α A_α) = ⋃_α f(A_α).


1.4. Decomposition of a set into classes. Equivalence relations.
Decompositions of a given set into pairwise disjoint subsets play an important role
in a great variety of problems. For example, the plane (regarded as a point 
set) can be decomposed into lines parallel to the x-axis, three-dimensional 
space can be decomposed into concentric spheres, the inhabitants of a given 
city can be decomposed into different age groups, and so on. Any such 
representation of a given set M as the union of a family of pairwise disjoint 
subsets of M is called a decomposition or partition of M into classes. 

A decomposition is usually made on the basis of some criterion, allowing 
us to assign the elements of M to one class or another. For example, the 
set of all triangles in the plane can be decomposed into classes of congruent 
triangles or into classes of triangles of equal area, the set of all functions 
of x can be decomposed into classes of functions all taking the same value at 
a given point x, and so on. Despite the great variety of such criteria, they 
are not completely arbitrary. For example, it is obviously impossible to
partition all real numbers into classes by assigning the number b to the same
class as the number a if and only if b > a. In fact, if b > a, b must be
assigned to the same class as a, but then a cannot be assigned to the same
class as b, since a < b. Moreover, since a is not greater than itself, a cannot
even be assigned to the class containing itself! As another example, it is
impossible to partition the points of the plane into classes by assigning two
points to the same class if and only if the distance between them is less than 1.
In fact, if the distance between a and b is less than 1 and if the distance
between b and c is less than 1, it does not follow that the distance between
a and c is less than 1. Thus, by assigning a to the same class as b and b to
the same class as c, we may well find that two points fall in the same class
even though the distance between them is greater than 1!

These examples suggest conditions which must be satisfied by any criterion
if it is to be used as the basis for partitioning a given set into classes. Let
M be a set, and let certain ordered pairs (a, b) of elements of M be called
“labelled.” If (a, b) is a labelled pair, we say that a is related to b by the
(binary) relation R and write aRb.³ For example, if a and b are real numbers,
aRb might mean a < b, while if a and b are triangles, aRb might mean that
a and b have the same area. A relation between elements of M is called
a relation on M if there is at least one labelled pair (a, b) for every a ∈ M.
A relation R on M is called an equivalence relation (on M) if it satisfies the
following three conditions:


1) Reflexivity: aRa for every a ∈ M;
2) Symmetry: If aRb, then bRa; 
3) Transitivity: If aRb and bRc, then aRc. 


THEOREM 4. A set M can be partitioned into classes by a relation R 
(acting as a criterion for assigning two elements to the same class) if and 
only if R is an equivalence relation on M. 


Proof. Every partition of M determines a binary relation on M, where
aRb means that “a belongs to the same class as b.” It is then obvious
that R must be reflexive, symmetric and transitive, i.e., that R is an
equivalence relation on M.

Conversely, let R be an equivalence relation on M, and let K_a be the
set of all elements x ∈ M such that xRa (clearly a ∈ K_a, since R is
reflexive). Then two classes K_a and K_b are either identical or disjoint.
In fact, suppose an element c belongs to both K_a and K_b, so that cRa
and cRb. Then aRc by the symmetry, and hence


    aRb    (8)


by the transitivity. If now x ∈ K_a, then xRa and hence xRb by (8) and the
transitivity, i.e., x ∈ K_b. Virtually the same argument shows that x ∈ K_b
implies x ∈ K_a. Therefore K_a = K_b if K_a and K_b have an element in
common. Since every a ∈ M belongs to K_a, the distinct sets K_a form a
partition of M into classes. ∎

³ Put somewhat differently, let M^2 be the set of all ordered pairs (a, b) with a, b ∈ M,
and let ℛ be the subset of M^2 consisting of all labelled pairs. Then aRb if and only if
(a, b) ∈ ℛ, i.e., a binary relation is essentially just a subset of M^2. As an exercise, state
the three conditions for R to be an equivalence relation in terms of ordered pairs and the
set ℛ.


Remark. Because of Theorem 4, one often talks about the decomposition 
of M into equivalence classes. 
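
The construction used in the proof of Theorem 4 (form the class of all x related to a, for each a) can be run on a finite set to see when it produces a partition. This is a sketch under stated assumptions: M is taken to be {0, ..., 9}, and the three sample relations are illustrative finite analogues of the examples in the text, not taken from it.

```python
def classes(M, related):
    """Collect the classes {x in M : x R a} for every a in M;
    identical classes are merged automatically."""
    return {frozenset(x for x in M if related(x, a)) for a in M}

def is_partition(M, family):
    """True if the sets in `family` cover M and no element is counted twice."""
    covered = set().union(*family) if family else set()
    return covered == set(M) and sum(len(k) for k in family) == len(M)

def same_remainder(a, b):
    return a % 3 == b % 3        # reflexive, symmetric and transitive

def close(a, b):
    return abs(a - b) < 2        # reflexive and symmetric, but not transitive

def greater(a, b):
    return a > b                 # not even reflexive

M = range(10)
print(is_partition(M, classes(M, same_remainder)))   # True
print(is_partition(M, classes(M, close)))            # False: the classes overlap
print(is_partition(M, classes(M, greater)))          # False: 0 lies in no class
```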


There is an intimate connection between mappings and partitions into 
classes, as shown by the following examples: 


Example 1. Let f be a mapping of a set A into a set B and partition A 
into sets, each consisting of all elements with the same image b = f(a) € B. 
This gives a partition of A into classes. For example, suppose f projects 
the xy-plane onto the x-axis, by mapping the point (x, y) into the point 
(x, 0). Then the preimages of the points of the x-axis are vertical lines, and 
the representation of the plane as the union of these lines is the decomposition 
into classes corresponding to f.
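
For a mapping of a finite set, the classes of Example 1 can be computed by grouping elements according to their images. The sketch below again uses a small grid of points in place of the plane; the grid and the function name `classes_of` are assumptions made for illustration.

```python
from collections import defaultdict

def classes_of(f, A):
    """Partition the finite set A into classes of elements with the same image under f."""
    by_image = defaultdict(set)
    for a in A:
        by_image[f(a)].add(a)
    return dict(by_image)

# Projection (x, y) -> (x, 0) on a small grid of points (illustrative choice).
grid = {(x, y) for x in range(3) for y in range(4)}
fibers = classes_of(lambda p: (p[0], 0), grid)

for b in sorted(fibers):
    print(b, sorted(fibers[b]))   # each class is a "vertical line" x = const
```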


Example 2. Given any partition of a set A into classes, let B be the set of 
these classes and associate each element a € A with the class (i.e., element 
of B) to which it belongs. This gives a mapping of A into B. For example, 
suppose we partition three-dimensional space into classes by assigning to the 
same class all points which are equidistant from the origin of coordinates. 
Then every class is a sphere of a certain radius. The set of all these classes 
can be identified with the set of points on the half-line [0, ∞), each point
corresponding to a possible value of the radius. In this sense, the
decomposition of space into concentric spheres corresponds to the mapping of
space into the half-line [0, ∞).


Example 3. Suppose we assign all real numbers with the same fractional
part⁴ to the same class. Then the mapping corresponding to this partition
has the effect of “winding” the real line onto a circle of unit circumference.

⁴ The largest integer ≤ x is called the integral part of x, denoted by [x], and the quantity
x − [x] is called the fractional part of x.

Problem 1. Prove that if A ∪ B = A and A ∩ B = A, then A = B.

Problem 2. Show that in general (A − B) ∪ B ≠ A.

Problem 3. Let A = {2, 4, ..., 2n, ...} and B = {3, 6, ..., 3n, ...}.
Find A ∩ B and A − B.

Problem 4. Prove that

    a) (A − B) ∩ C = (A ∩ C) − (B ∩ C);
    b) A △ B = (A ∪ B) − (A ∩ B).

Problem 5. Prove that

    ⋃_α A_α − ⋃_α B_α ⊂ ⋃_α (A_α − B_α).


Problem 6. Let A_n be the set of all positive integers divisible by n. Find
the sets

    a) ⋃_{n=2}^∞ A_n;    b) ⋂_{n=2}^∞ A_n.

Problem 7. Find

    a) ⋃_{n=1}^∞ [a + 1/n, b − 1/n];    b) ⋂_{n=1}^∞ (a − 1/n, b + 1/n).


Problem 8. Let A_α be the set of points lying on the curve

    y = 1/x^α    (0 < x < ∞).

What is

    ⋂_{α ≥ 1} A_α?


Problem 9. Let y = f(x) = (x) for all real x, where (x) is the fractional 
part of x. Prove that every closed interval of length 1 has the same image 
under f. What is this image? Is f one-to-one? What is the preimage of the 
interval 1/4 ≤ y ≤ 3/4? Partition the real line into classes of points with the
same image. 


Problem 10. Given a set M, let ℛ be the set of all ordered pairs of the
form (a, a) with a ∈ M, and let aRb if and only if (a, b) ∈ ℛ. Interpret the
relation R.


Problem 11. Give an example of a binary relation which is 


a) Reflexive and symmetric, but not transitive; 

b) Reflexive, but neither symmetric nor transitive; 
c) Symmetric, but neither reflexive nor transitive; 
d) Transitive, but neither reflexive nor symmetric.


2. Equivalence of Sets. The Power of a Set 


2.1. Finite and infinite sets. The set of all vertices of a given polyhedron,
the set of all prime numbers less than a given number, and the set of all
residents of New York City (at a given time) have a certain property in
common, namely, each set has a definite number of elements which can be
found in principle, if not in practice. Accordingly, these sets are all said to
be finite. Clearly, we can be sure that a set is finite without knowing the
number of elements in it. On the other hand, the set of all positive integers,
the set of all points on the line, the set of all circles in the plane, and the
set of all polynomials with rational coefficients have a different property
in common, namely, if we remove one element from each set, then remove
two elements, three elements, and so on, there will still be elements left in
the set at each stage. Accordingly, sets of this kind are said to be infinite.

Given two finite sets, we can always decide whether or not they have the 
same number of elements, and if not, we can always determine which set 
has more elements than the other. It is natural to ask whether the same is 
true of infinite sets. In other words, does it make sense to ask, for example, 
whether there are more circles in the plane than rational points on the line, 
or more functions defined in the interval [0, 1] than lines in space? As will 
soon be apparent, questions of this kind can indeed be answered. 

To compare two finite sets A and B, we can count the number of elements 
in each set and then compare the two numbers, but alternatively, we can try 
to establish a one-to-one correspondence between (the elements of) A and B,
i.e., a correspondence such that each element in A corresponds to one and
only one element in B and vice versa. It is clear that a one-to-one
correspondence between two finite sets can be set up if and only if the two sets
have the same number of elements. For example, to ascertain whether or 
not the number of students in an assembly is the same as the number of 
seats in the auditorium, there is no need to count the number of students 
and the number of seats. We need merely observe whether or not there are 
empty seats or students with no place to sit down. If the students can all 
be seated with no empty seats left, i.e., if there is a one-to-one correspondence 
between the set of students and the set of seats, then these two sets obviously 
have the same number of elements. The important point here is that the 
first method (counting elements) works only for finite sets, while the second 
method (setting up a one-to-one correspondence) works for infinite sets as 
well as for finite sets. 


2.2. Countable sets. The simplest infinite set is the set Z_+ of all positive
integers. An infinite set is called countable if its elements can be put in
one-to-one correspondence with those of Z_+. In other words, a countable set is a
set whose elements can be numbered a_1, a_2, ..., a_n, .... By an uncountable
set we mean, of course, an infinite set which is not countable.

We now give some examples of countable sets: 


Example 1. The set Z of all integers, positive, negative or zero, is
countable. In fact, we can set up the following one-to-one correspondence
between Z and the set Z_+ of all positive integers:

    0, −1, 1, −2, 2, ...
    1,  2, 3,  4, 5, ...

More explicitly, we associate the nonnegative integer n ≥ 0 with the odd
number 2n + 1, and the negative integer n < 0 with the even number 2|n|,
i.e.,

    n ↔ 2n + 1    if n ≥ 0,
    n ↔ 2|n|      if n < 0

(the symbol ↔ denotes a one-to-one correspondence).
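
The correspondence of Example 1 and its inverse can be written out directly. In this sketch the function names `to_positive` and `from_positive` are hypothetical; the formulas are the ones stated above, n ↔ 2n + 1 for n ≥ 0 and n ↔ 2|n| for n < 0.

```python
def to_positive(n):
    """Map an integer n to its number in the list 0, -1, 1, -2, 2, ..."""
    return 2 * n + 1 if n >= 0 else 2 * abs(n)

def from_positive(k):
    """Inverse map: the integer occupying position k = 1, 2, 3, ..."""
    return (k - 1) // 2 if k % 2 == 1 else -(k // 2)

print([from_positive(k) for k in range(1, 8)])   # [0, -1, 1, -2, 2, -3, 3]
```

That the two functions undo each other on any range one cares to test is exactly what “one-to-one correspondence” means here.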


Example 2. The set of all positive even numbers is countable, as shown
by the obvious correspondence n ↔ 2n.


Example 3. The set 2, 4, 8, ..., 2^n, ... of powers of 2 is countable, as
shown by the obvious correspondence n ↔ 2^n.


Example 4. The set Q of all rational numbers is countable. To see this,
we first note that every rational number α can be written as a fraction p/q
in lowest terms with a positive denominator q > 0. Call the sum |p| + q the
“height” of the rational number α. For example,

    0/1 = 0

is the only rational number of height 1,

    −1/1, 1/1

are the only rational numbers of height 2,

    −2/1, −1/2, 1/2, 2/1

are the only rational numbers of height 3, and so on. We can now arrange
all rational numbers in order of increasing height (with the numerators
increasing in each set of rational numbers of the same height). In other
words, we first count the rational numbers of height 1, then those of height
2 (suitably arranged), those of height 3, and so on. In this way, we assign
every rational number a unique positive integer, i.e., we set up a one-to-one
correspondence between the set Q of all rational numbers and the set Z_+
of all positive integers.
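
The counting by height can be sketched as a generator that lists the rationals height by height, with numerators increasing within each height. The function name is hypothetical, and the use of Python's `Fraction` type is a convenience rather than anything the text prescribes.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def rationals_by_height():
    """Yield every rational number exactly once, in order of increasing
    height |p| + q, with numerators increasing within each height."""
    height = 1
    while True:
        for p in range(-(height - 1), height):   # numerators -(h-1), ..., h-1
            q = height - abs(p)                   # denominator fixed by the height
            if gcd(p, q) == 1:                    # keep only fractions in lowest terms
                yield Fraction(p, q)
        height += 1

print([str(r) for r in islice(rationals_by_height(), 7)])
# ['0', '-1', '1', '-2', '-1/2', '1/2', '2']
```

The lowest-terms test is what guarantees that no rational number is counted twice; for instance 0/2 and 2/2 are skipped at heights 2 and 4.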


Next we prove some elementary theorems involving countable sets:

THEOREM 1. Every subset of a countable set is countable.


Proof. Let A be countable, with elements a_1, a_2, ..., and let B be a
subset of A. Among the elements a_1, a_2, ..., let a_{n_1}, a_{n_2}, ... be those in
the set B. If the set of numbers n_1, n_2, ... has a largest number, then
B is finite. Otherwise B is countable (consider the correspondence
i ↔ a_{n_i}). ∎


THEOREM 2. The union of a finite or countable number of countable
sets A_1, A_2, ... is itself countable.


Proof. We can assume that no two of the sets A_1, A_2, ... have
elements in common, since otherwise we could consider the sets


    A_1, A_2 − A_1, A_3 − (A_1 ∪ A_2), ...


instead, which are countable by Theorem 1 and have the same union as
the original sets. Suppose we write the elements of A_1, A_2, ... in the
form of an infinite table


    a_11  a_12  a_13  a_14  ...
    a_21  a_22  a_23  a_24  ...
    a_31  a_32  a_33  a_34  ...        (1)
    ...   ...   ...   ...   ...


where the elements of the set A_1 appear in the first row, the elements of
the set A_2 appear in the second row, and so on. We now count all the
elements in (1) “diagonally,” i.e., first we choose a_11, then a_12, then a_21,
and so on, moving in the way shown in the following table:⁵


    a_11 → a_12    a_13 → a_14  ...
         ↙      ↗      ↙
    a_21    a_22    a_23    a_24  ...
      ↓  ↗      ↙
    a_31    a_32    a_33    a_34  ...        (2)
         ↙
    ...


It is clear that this procedure associates a unique number to each element
in each of the sets A_1, A_2, ..., thereby establishing a one-to-one
correspondence between the union of the sets A_1, A_2, ... and the set
Z_+ of all positive integers. ∎
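
The diagonal counting can be sketched in code by enumerating the index pairs (i, k) of the table one diagonal i + k = const at a time. As a simplifying assumption, the sketch below always runs down each diagonal in the same direction instead of alternating as in the table; either choice numbers every entry exactly once. The function names are hypothetical.

```python
from itertools import islice

def diagonal_pairs():
    """Yield index pairs (i, k) with i, k >= 1, one diagonal i + k = s at a time."""
    s = 2
    while True:
        for i in range(1, s):
            yield (i, s - i)
        s += 1

def position(i, k):
    """The positive integer assigned to the entry in row i, column k."""
    s = i + k
    # The diagonals before this one contain 1 + 2 + ... + (s - 2) entries.
    return (s - 2) * (s - 1) // 2 + i

print(list(islice(diagonal_pairs(), 6)))
# [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
```

The closed formula in `position` makes the one-to-one character of the correspondence explicit: each pair receives a number, and distinct pairs receive distinct numbers.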


⁵ Discuss the obvious modifications of (1) and (2) in the case of only a finite number
of sets A_1, A_2, ....

THEOREM 3. Every infinite set has a countable subset.


Proof. Let M be an infinite set and a_1 any element of M. Being
infinite, M contains an element a_2 distinct from a_1, an element a_3 distinct
from both a_1 and a_2, and so on. Continuing this process (which can
never terminate due to a “shortage” of elements, since M is infinite),
we get a countable subset


    A = {a_1, a_2, ..., a_n, ...}

of the set M. ∎


Remark. Theorem 3 shows that countable sets are the “smallest” infinite
sets. The question of whether there exist uncountable (infinite) sets will be
considered below.


2.3. Equivalence of sets. We arrived at the notion of a countable set M
by considering one-to-one correspondences between M and the set Z_+ of all
positive integers. More generally, we can consider one-to-one correspondences
between any two sets M and N:


DEFINITION. Two sets M and N are said to be equivalent⁶ (written
M ~ N) if there is a one-to-one correspondence between the elements of
M and the elements of N.


The concept of equivalence is applicable to both finite and infinite sets.
Two finite sets are equivalent if and only if they have the same number of
elements. We can now define a countable set as a set equivalent to the set
Z_+ of all positive integers. It is clear that two sets which are equivalent to a
third set are equivalent to each other, and in particular that any two countable
sets are equivalent.


Example 1. The sets of points in any two closed intervals [a, b] and
[c, d] are equivalent, and Figure 5 shows how to set up a one-to-one
correspondence between them. Here two points p and q correspond to each
other if and only if they lie on the same ray emanating from the point O in
which the extensions of the line segments ac and bd intersect.

FIGURE 5


Example 2. The set of all points z in the complex plane is equivalent to
the set of all points α on a sphere. In fact, a one-to-one correspondence
z ↔ α between the points of the two sets can be established by using
stereographic projection, as shown in Figure 6 (O is the north pole of the
sphere).

⁶ Not to be confused with our previous use of the word in the phrase “equivalence
relation.” However, note that set equivalence is an equivalence relation in the sense of
Sec. 1.4, being obviously reflexive, symmetric and transitive. Hence any family of sets
can be partitioned into classes of equivalent sets.





FIGURE 6


Example 3. The set of all points y in the open unit interval (0, 1) is
equivalent to the set of all points x on the whole real line. For example, the
formula


    y = (1/π) arctan x + 1/2


establishes a one-to-one correspondence between these two sets.
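
A quick numerical illustration, assuming the formula of Example 3 reads y = (1/π) arctan x + 1/2 with x ranging over the real line and y over (0, 1). The inverse x = tan(π(y − 1/2)) is not stated in the text and is supplied here for the sketch; the function names are hypothetical.

```python
import math

def to_unit_interval(x):
    """Map a real number x to y = (1/pi) * arctan(x) + 1/2, which lies in (0, 1)."""
    return math.atan(x) / math.pi + 0.5

def to_real_line(y):
    """Inverse map: x = tan(pi * (y - 1/2)) for 0 < y < 1."""
    return math.tan(math.pi * (y - 0.5))

for x in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    y = to_unit_interval(x)
    print(x, y, to_real_line(y))   # y stays strictly between 0 and 1
```

In floating-point arithmetic the round trip is only approximate, and for very large |x| the computed y becomes indistinguishable from 0 or 1.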


The last example and the examples in Sec. 2.2 show that an infinite set 
is sometimes equivalent to one of its proper subsets. For example, there are
“as many” positive integers as integers of arbitrary sign, there are “as many”
points in the interval (0, 1) as on the whole real line, and so on. This fact
is characteristic of all infinite sets (and can be used to define such sets), as 
shown by 


THEOREM 4. Every infinite set is equivalent to one of its proper subsets. 


Proof. According to Theorem 3, any infinite set M contains a
countable subset. Let this subset be


    A = {a_1, a_2, ..., a_n, ...},

and partition A into two countable subsets

    A_1 = {a_1, a_3, a_5, ...},    A_2 = {a_2, a_4, a_6, ...}.


Obviously, we can establish a one-to-one correspondence between the
countable sets A and A_1 (merely let a_n ↔ a_{2n−1}). This correspondence
can be extended to a one-to-one correspondence between the sets
A ∪ (M − A) = M and A_1 ∪ (M − A) = M − A_2 by simply assigning x
itself to each element x ∈ M − A. But M − A_2 is a proper subset of
M. ∎


2.4. Uncountability of the real numbers. Several examples of countable
sets were given in Sec. 2.2, and many more examples of such sets could be 
given. In fact, according to Theorem 2, the union of a finite or countable 
number of countable sets is itself countable. It is now natural to ask whether 
there exist infinite sets which are uncountable. The existence of such sets 
is shown by

THEOREM 5. The set of real numbers in the closed unit interval [0, 1] is
uncountable.


Proof. Suppose we have somehow managed to count some or all of
the real numbers in [0, 1], arranging them in a list


    α_1 = 0.a_11 a_12 ... a_1n ...,
    α_2 = 0.a_21 a_22 ... a_2n ...,
    . . . . . . . . . . . . . . .        (3)
    α_n = 0.a_n1 a_n2 ... a_nn ...,
    . . . . . . . . . . . . . . .


where a_ik is the kth digit in the decimal expansion of the number α_i.
Consider the decimal


    β = 0.b_1 b_2 ... b_n ...    (4)


constructed as follows: For b_1 choose any digit (from 0 to 9) different
from a_11, for b_2 any digit different from a_22, and so on, and in general
for b_n any digit different from a_nn. Then the decimal (4) cannot coincide
with any decimal in the list (3). In fact, β differs from α_1 in at least the
first digit, from α_2 in at least the second digit, and so on, since in general
b_n ≠ a_nn for all n. Thus no list of real numbers in the interval [0, 1]
can include all the real numbers in [0, 1].

The above argument must be refined slightly since certain numbers,
namely those of the form p/10^q, can be written as decimals in two ways,
either with an infinite run of zeros or an infinite run of nines. For
example,

    1/2 = 5/10 = 0.5000... = 0.4999...,


so that the fact that two decimals are distinct does not necessarily mean
that they represent distinct real numbers. However, this difficulty
disappears if in constructing β, we require that β contain neither zeros
nor nines, for example by setting b_n = 2 if a_nn = 1 and b_n = 1 if
a_nn ≠ 1. ∎
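
The diagonal construction can be illustrated on a finite piece of a list. The sketch below takes the first n digits of n listed decimals (the four digit strings are made-up sample data, not from the text) and applies the rule b_n = 2 if a_nn = 1 and b_n = 1 otherwise.

```python
def diagonal_number(rows):
    """Given n digit strings of length >= n (the digits after "0."),
    return n digits of a decimal that differs from the i-th row in its i-th digit."""
    digits = []
    for n, row in enumerate(rows):
        digits.append("2" if row[n] == "1" else "1")   # b_n = 2 if a_nn = 1, else 1
    return "".join(digits)

listed = ["5000", "3333", "1415", "1111"]   # illustrative truncated expansions
beta = diagonal_number(listed)
print("0." + beta)                           # 0.1122
```

Since the constructed digits are only ever 1 or 2, the resulting decimal contains neither zeros nor nines, which is what the refinement above requires.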


Thus the set [0, 1] is uncountable. Other examples of uncountable sets 
equivalent to [0, 1] are 


1) The set of points in any closed interval [a, b];
2) The set of points on the real line;
3) The set of points in any open interval (a, b);
4) The set of all points in the plane or in space;
5) The set of all points on a sphere or inside a sphere;
6) The set of all lines in the plane;
7) The set of all continuous real functions of one or several variables.

The fact that the sets 1) and 2) are equivalent to [0, 1] is proved as in Examples
1 and 3 of Sec. 2.3, while the fact that the sets 3)-7) are equivalent
to [0, 1] is best proved indirectly (cf. Problems 7 and 9).


2.5. The power of a set. Given any two sets M and N, suppose M and N 
are equivalent. Then M and N are said to have the same power. Roughly 
speaking, “power” is something shared by equivalent sets. If M and N are 
finite, then M and N have the same number of elements, and the concept 
of the power of a set reduces to the usual notion of the number of elements 
in a set. The power of the set $Z_+$ of all positive integers, and hence the power
of any countable set, is denoted by the symbol $\aleph_0$, read “aleph null.” A
set equivalent to the set of real numbers in the interval [0, 1], and hence to
the set of all real numbers, is said to have the power of the continuum,
denoted by $c$ (or often by $\aleph$).

For the powers of finite sets, i.e., for the positive integers, we have the 
notions of “greater than’’ and “less than,’’ as well as the notion of equality. 
We now show how these concepts are extended to the case of infinite sets. 

Let A and B be any two sets, with powers m(A) and m(B), respectively. 
If A is equivalent to B, then m(A) = m(B) by definition. If A is equivalent 
to a subset of B and if no subset of A is equivalent to B, then, by analogy 
with the finite case, it is natural to regard m(A) as less than m(B) or m(B) as 
greater than m(A). Logically, however, there are two further possibilities : 


a) B has a subset equivalent to A, and A has a subset equivalent to B; 
b) A and B are not equivalent, and neither has a subset equivalent to the 
other. 


In case a), A and B are equivalent and hence have the same power, as shown 
by the Cantor-Bernstein theorem (Theorem 7 below). Case b) would obvi- 
ously show the existence of powers that cannot be compared, but it follows 
from the well-ordering theorem (see Sec. 3.7) that this case is actually impos- 
sible. Therefore, taking both of these theorems on faith, we see that any two 
sets A and B either have the same power or else satisfy one of the relations
m(A) < m(B) or m(A) > m(B). For example, it is clear that $\aleph_0 < c$
(why?).


Remark. The very deep problem of the existence of powers between $\aleph_0$
and $c$ is touched upon in Sec. 3.9. As a rule, however, the infinite sets
encountered in analysis are either countable or else have the power of the 
continuum. 


We have already noted that countable sets are the ‘smallest’ infinite 
sets. It has also been shown that there are infinite sets of power greater 
than that of a countable set, namely sets with the power of the continuum. 
It is natural to ask whether there are infinite sets of power greater than that 
of the continuum or, more generally, whether there is a “largest” power.
These questions are answered by 


THEOREM 6. Given any set M, let $\mathcal{M}$ be the set whose elements are all
possible subsets of M. Then the power of $\mathcal{M}$ is greater than the power of
the original set M.


Proof. Clearly, the power $\mathfrak{m}$ of the set $\mathcal{M}$ cannot be less than the power
m of the original set M, since the “single-element subsets” (or “singletons”)
of M form a subset of $\mathcal{M}$ equivalent to M. Thus we need only
show that m and $\mathfrak{m}$ do not coincide. Suppose a one-to-one correspondence


$$a \leftrightarrow A, \quad b \leftrightarrow B, \quad \ldots$$


has been established between the elements a, b, ... of M and certain
elements A, B, ... of $\mathcal{M}$ (i.e., certain subsets of M). Then A, B, ...
do not exhaust all the elements of $\mathcal{M}$, i.e., all the subsets of M. To see
this, let X be the set of elements of M which do not belong to their
“associated subsets.” More exactly, if $a \leftrightarrow A$ we assign a to X if $a \notin A$,
but not if $a \in A$. Clearly, X is a subset of M and hence an element of $\mathcal{M}$.
Suppose there is an element $x \in M$ such that $x \leftrightarrow X$, and consider
whether or not x belongs to X. Suppose $x \notin X$. Then $x \in X$, since, by
definition, X contains every element not contained in its associated
subset. On the other hand, suppose $x \in X$. Then $x \notin X$, since X consists
precisely of those elements which do not belong to their associated
subsets. In any event, the element x corresponding to the subset X must
simultaneously belong to X and not belong to X. But this is impossible!
It follows that there is no such element x. Therefore no one-to-one
correspondence can be established between the sets M and $\mathcal{M}$, i.e.,


$$m \neq \mathfrak{m}.$$ ∎


Thus, given any set M, there is a set $\mathcal{M}$ of larger power, a set $\mathcal{M}^*$ of
still larger power, and so on indefinitely. In particular, there is no set of
“largest” power.
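For a small finite set, the argument of the proof can be checked exhaustively. The sketch below is an illustration only: the helper names `subsets` and `diagonal_set` are assumptions, and the check runs over every assignment of subsets to elements (not only one-to-one ones), confirming each time that the set X of elements not belonging to their associated subsets is never one of the assigned subsets.

```python
from itertools import combinations, product


def subsets(M):
    """All subsets of the finite set M, as frozensets."""
    items = sorted(M)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(M, assoc):
    """X = the elements of M that do not belong to their associated subsets."""
    return frozenset(a for a in M if a not in assoc[a])


M = {0, 1, 2}
P = subsets(M)            # the 8 subsets of M
elems = sorted(M)

# Try every way of associating a subset with each element of M.
missed_every_time = all(
    diagonal_set(M, dict(zip(elems, images))) not in images
    for images in product(P, repeat=len(elems))
)
```

With three elements there are 8 subsets and 512 assignments, so the brute-force check is cheap. It is no substitute for the proof, which covers infinite sets as well.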


2.6. The Cantor-Bernstein theorem. Next we prove an important theorem 
already used in the preceding section: 


THEOREM 7 (Cantor-Bernstein). Given any two sets A and B, suppose
A contains a subset $A_1$ equivalent to B, while B contains a subset $B_1$
equivalent to A. Then A and B are equivalent.


Proof. By hypothesis, there is a one-to-one function f mapping A
into B, and a one-to-one function g mapping B into A:

$$f(A) = B_1 \subset B, \qquad g(B) = A_1 \subset A.$$

Therefore


$$A_2 = gf(A) = g(f(A)) = g(B_1)$$

is a subset of $A_1$ equivalent to all of A. Similarly,


$$B_2 = fg(B) = f(g(B)) = f(A_1)$$


is a subset of $B_1$ equivalent to B. Let $A_3$ be the subset of A into which
the mapping gf carries the set $A_1$, and let $A_4$ be the subset of A into which
gf carries $A_2$. More generally, let $A_{k+2}$ be the set into which $A_k$
(k = 1, 2, ...) is carried by gf. Then clearly


$$A \supset A_1 \supset A_2 \supset \cdots \supset A_k \supset A_{k+1} \supset \cdots.$$

Setting

$$D = \bigcap_{k=1}^{\infty} A_k,$$


we can represent A as the following union of pairwise disjoint sets:


$$A = (A - A_1) \cup (A_1 - A_2) \cup (A_2 - A_3) \cup \cdots \cup (A_k - A_{k+1}) \cup \cdots \cup D. \tag{5}$$


Similarly, we can write $A_1$ in the form

$$A_1 = (A_1 - A_2) \cup (A_2 - A_3) \cup \cdots \cup (A_k - A_{k+1}) \cup \cdots \cup D. \tag{6}$$

Clearly, (5) and (6) can be rewritten as

$$A = D \cup M \cup N, \tag{5'}$$


$$A_1 = D \cup M \cup N_1, \tag{6'}$$

where

$$
\begin{aligned}
M &= (A_1 - A_2) \cup (A_3 - A_4) \cup \cdots, \\
N &= (A - A_1) \cup (A_2 - A_3) \cup \cdots, \\
N_1 &= (A_2 - A_3) \cup (A_4 - A_5) \cup \cdots.
\end{aligned}
$$


But $A - A_1$ is equivalent to $A_2 - A_3$ (the former is carried into the latter
by the one-to-one function gf), $A_2 - A_3$ is equivalent to $A_4 - A_5$, and
so on. Therefore N is equivalent to $N_1$. It follows from the representations
(5') and (6') that a one-to-one correspondence can be set up
between the sets A and $A_1$. But $A_1$ is equivalent to B, by hypothesis.
Therefore A is equivalent to B. ∎


Remark. Here we can even “afford the unnecessary luxury” of explicitly
writing down a one-to-one function carrying A onto B, i.e.,


$$
\varphi(a) =
\begin{cases}
g^{-1}(a) & \text{if } a \in D \cup M, \\
f(a) & \text{if } a \in N
\end{cases}
$$


(see Figure 7).
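The explicit function of the remark can be sketched in Python for a concrete pair of injections. Everything named here is an assumption made for illustration: A and B are both taken to be the positive integers, f and g both send n to 2n, and `f_inv`, `g_inv` are partial inverses returning `None` outside the image. An element whose backward chain stops in A lies in N and is sent by f. One whose chain stops in B lies in M and is sent by the inverse of g. The sketch assumes every backward chain stops, so the set D is empty in this example.

```python
def cantor_bernstein(f, f_inv, g_inv):
    """Build phi: A -> B from injections f: A -> B and g: B -> A.

    f_inv and g_inv are partial inverses (None outside the image).
    Assumption of this sketch: every backward chain terminates.
    """
    def phi(a):
        x, in_A = a, True
        while True:
            # Step backwards along ... -> A -> B -> A -> ... while possible.
            prev = g_inv(x) if in_A else f_inv(x)
            if prev is None:
                break
            x, in_A = prev, not in_A
        # Chain stopped in A: a lies in N, use f.
        # Chain stopped in B: a lies in M, use the inverse of g.
        return f(a) if in_A else g_inv(a)
    return phi


def half(n):
    """Partial inverse of n -> 2n on the positive integers."""
    return n // 2 if n % 2 == 0 else None


phi = cantor_bernstein(lambda n: 2 * n, half, half)
values = [phi(a) for a in range(1, 65)]
```

On this example phi sends 1 to 2, 2 to 1, 3 to 6, 4 to 8, and so on. No two of the first 64 arguments share a value, and every integer from 1 to 32 appears among the values.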


FIGURE 7 


Problem 1. Prove that a set with an uncountable subset is itself
uncountable.


Problem 2. Let M be any infinite set and A any countable set. Prove that
$M \sim M \cup A$.


Problem 3. Prove that each of the following sets is countable: 


a) The set of all numbers with two distinct decimal expansions (like 
0.5000... and 0.4999... .); 

b) The set of all rational points in the plane (i.e., points with rational 
coordinates); 

c) The set of all rational intervals (i.e., intervals with rational end points); 

d) The set of all polynomials with rational coefficients. 


Problem 4. A number $\alpha$ is called algebraic if it is a root of a polynomial
equation with rational coefficients. Prove that the set of all algebraic numbers 
is countable. 


Problem 5. Prove the existence of uncountably many transcendental num- 
bers, i.e., numbers which are not algebraic. 


Hint. Use Theorems 2 and 5. 


Problem 6. Prove that the set of all real functions (more generally, 
functions taking values in a set containing at least two elements) defined 
on a set M is of power greater than the power of M. In particular, prove 
that the power of the set of all real functions (continuous and discontinuous) 
defined in the interval [0, 1] is greater than c. 


Hint. Use the fact that the set of all characteristic functions (i.e., functions 
taking only the values 0 and 1) on M is equivalent to the set of all subsets 
of M. 


Problem 7. Give an indirect proof of the equivalence of the closed interval
[a, b], the open interval (a, b) and the half-open interval [a, b) or (a, b].


Hint. Use Theorem 7.


Problem 8. Prove that the union of a finite or countable number of sets
each of power c is itself of power c. 


Problem 9. Prove that each of the following sets has the power of the 
continuum: 


a) The set of all infinite sequences of positive integers; 
b) The set of all ordered n-tuples of real numbers; 
c) The set of all infinite sequences of real numbers. 


Problem 10. Develop a contradiction inherent in the notion of the “set
of all sets which are not members of themselves.”


Hint. Is this set a member of itself?


Comment. Thus we will be careful to avoid sets which are “too big,” like
the “set of all sets.”


3. Ordered Sets and Ordinal Numbers 


3.1. Partially ordered sets. A binary relation R on a set M is said to be a 
partial ordering (and the set M itself is said to be partially ordered) if 


1) R is reflexive (aRa for every $a \in M$);
2) R is transitive (aRb and bRc together imply aRc);
3) R is antisymmetric in the sense that aRb and bRa together imply a = b.


For example, if M is the set of all real numbers and aRb means $a \leq b$, then
R is a partial ordering. This suggests writing $a \leq b$ (or equivalently $b \geq a$)
instead of aRb whenever R is a partial ordering, and we will do so from now
on. Similarly, we write $a < b$ if $a \leq b$, $a \neq b$ and $b > a$ if $b \geq a$, $b \neq a$.

The following examples give some idea of the generality of the concept 
of a partial ordering: 


Example 1. Any set M can be partially ordered in a trivial way by setting
$a \leq b$ if and only if a = b.


Example 2. Let M be the set of all continuous functions f, g, ... defined
in a closed interval $[\alpha, \beta]$. Then we get a partial ordering by setting $f \leq g$
if and only if $f(t) \leq g(t)$ for every $t \in [\alpha, \beta]$.


Example 3. The set of all subsets $M_1, M_2, \ldots$ of a given set is partially ordered if
$M_1 \leq M_2$ means that $M_1 \subset M_2$.


Example 4. The set of all integers greater than 1 is partially ordered if
$a \leq b$ means that “b is divisible by a.”

An element a of a partially ordered set is said to be maximal if $a \leq b$
implies b = a and minimal if $b \leq a$ implies b = a. Thus in Example 4 every
prime number (greater than 1) is a minimal element.
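The three defining properties, and the claim about minimal elements in Example 4, can be checked mechanically on a finite piece of the set. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the infinite set of Example 4 is truncated to the integers from 2 to 30. Truncation does not affect which elements are minimal, but it would distort any statement about maximal elements.

```python
from itertools import product


def is_partial_order(elems, leq):
    """Check reflexivity, transitivity and antisymmetry of leq on elems."""
    reflexive = all(leq(a, a) for a in elems)
    transitive = all(leq(a, c)
                     for a, b, c in product(elems, repeat=3)
                     if leq(a, b) and leq(b, c))
    antisymmetric = all(a == b
                        for a, b in product(elems, repeat=2)
                        if leq(a, b) and leq(b, a))
    return reflexive and transitive and antisymmetric


def minimal_elements(elems, leq):
    """Elements a such that b <= a implies b == a."""
    return [a for a in elems if all(b == a for b in elems if leq(b, a))]


def divides(a, b):
    """The ordering of Example 4: a <= b means b is divisible by a."""
    return b % a == 0


M = list(range(2, 31))
ok = is_partial_order(M, divides)
minimal = minimal_elements(M, divides)
```

On this range `minimal` consists of the primes up to 29, in agreement with the remark that every prime is a minimal element.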


3.2. Order-preserving mappings. Isomorphisms. Let M and M' be any
two partially ordered sets, and let f be a one-to-one mapping of M onto M'.
Then f is said to be order-preserving if $a \leq b$ (where $a, b \in M$) implies
$f(a) \leq f(b)$ (in M'). An order-preserving mapping f such that $f(a) \leq f(b)$ implies
$a \leq b$ is called an isomorphism. In other words, an isomorphism between
two partially ordered sets M and M' is a one-to-one mapping of M onto M'
such that $f(a) \leq f(b)$ if and only if $a \leq b$. Two partially ordered sets M
and M' are said to be isomorphic (to each other) if there exists an isomorphism
between them.


Example. Let M be the set of positive integers greater than 1 partially 
ordered as in Example 4, Sec. 3.1, and let M’ be the same set partially ordered 
in the natural way, i.e., in such a way that $a \leq b$ if and only if b - a is
nonnegative. Then the mapping of M onto M’ carrying every integer n 
into itself is order-preserving, but not an isomorphism. 
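The example can be verified on a finite range. This is a sketch under assumptions: the set is truncated to the integers from 2 to 12, and the names `divides`, `natural` and `f` are introduced only for illustration.

```python
M = range(2, 13)


def divides(a, b):
    """Ordering of M: a <= b means b is divisible by a."""
    return b % a == 0


def natural(a, b):
    """Ordering of M': the usual a <= b."""
    return a <= b


def f(n):
    """The mapping carrying every integer into itself."""
    return n


# a <= b in M implies f(a) <= f(b) in M'.
order_preserving = all(natural(f(a), f(b))
                       for a in M for b in M if divides(a, b))

# Pairs with f(a) <= f(b) in M' although a <= b fails in M.
counterexamples = [(a, b) for a in M for b in M
                   if natural(f(a), f(b)) and not divides(a, b)]
```

The pair (2, 3) is among the counterexamples: 2 precedes 3 in the natural order, but 3 is not divisible by 2, so the mapping is not an isomorphism.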


Isomorphism between partially ordered sets is an equivalence relation 
as defined in Sec. 1.4, being obviously reflexive, symmetric and transitive. 
Hence any given family of partially ordered sets can be partitioned into 
disjoint classes of isomorphic sets.⁷ Clearly, two isomorphic partially
ordered sets can be regarded as identical in cases where it is the structure 
of the partial ordering rather than the specific nature of the elements of the 
sets that is of interest. 


3.3. Ordered sets. Order types. Given two elements a and b of a partially
ordered set M, it may turn out that neither of the relations $a \leq b$ or $b \leq a$
holds. In this case, a and b are said to be noncomparable. Thus, in general,
the relation $\leq$ is defined only for certain pairs of elements, which is why M
is said to be partially ordered. However, suppose M has no noncomparable
elements. Then M is said to be ordered (synonymously, simply or linearly
ordered). In other words, a set M is ordered if it is partially ordered and if,
given any two distinct elements $a, b \in M$, either a < b or b < a. Obviously,
any subset of an ordered set is itself ordered.

Each of the sets figuring in Examples 1-4, Sec. 3.1 is partially ordered,
but not ordered. Simple examples of ordered sets are the set of all positive
integers, the set of all rational numbers, the set of all real numbers in the
interval [0, 1], and so on (with the usual relations of “greater than” and “less
than”).


7 Note that we avoid talking about the “family of all partially ordered sets” (recall
Problem 10, p. 20).

Since an ordered set is a special kind of partially ordered set, the concepts 
of order-preserving mapping and isomorphism apply equally well to ordered 
sets. Two isomorphic ordered sets are said to have the same (order) type. 
Thus “type’’ is something shared by all isomorphic ordered sets, just as 
“power’’ is something shared by all equivalent sets (considered as “plain”’ 
sets, without regard for possible orderings). 

The simplest example of an ordered set is the set of all positive integers
1, 2, 3, ... arranged in increasing order, with the usual meaning of the
symbol <. The order type of this set is denoted by the symbol $\omega$. Two isomorphic
ordered sets obviously have the same power (an isomorphism is a
one-to-one correspondence). Thus it makes sense to talk about the power
corresponding to a given order type. For example, the power $\aleph_0$ corresponds
to the order type $\omega$. The converse is not true, since a set of a given power can
in general be ordered in many different ways. It is only in the finite case that
the number of elements in a set uniquely determines its type, designated by
the same symbol n as the number of elements in the set. For example,
besides the “natural” order type $\omega$ of the set of positive integers, there is
another order type corresponding to the sequence


$$1, 3, 5, \ldots, 2, 4, 6, \ldots,$$


where odd and even numbers are separately arranged in increasing order,
but any odd number precedes any even number. It can be shown that the
number of distinct order types of a set of power $\aleph_0$ is infinite and in fact
uncountable.


3.4. Ordered sums and products of ordered sets. Let $M_1$ and $M_2$ be two
ordered sets of types $\theta_1$ and $\theta_2$, respectively. Then we can introduce an
ordering in the union $M_1 \cup M_2$ of the two sets by assuming that


1) a and b have the same ordering as in $M_1$ if $a, b \in M_1$;
2) a and b have the same ordering as in $M_2$ if $a, b \in M_2$;
3) a < b if $a \in M_1$, $b \in M_2$


(verify that this is actually an ordering of $M_1 \cup M_2$). The set $M_1 \cup M_2$
ordered in this way is called the ordered sum of $M_1$ and $M_2$, denoted by
$M_1 + M_2$. Note that the order of terms matters here, i.e., in general $M_2 + M_1$
is not isomorphic to $M_1 + M_2$. More generally, we can define the ordered
sum of any finite number of ordered sets by writing (cf. Problem 6)


$$
\begin{aligned}
M_1 + M_2 + M_3 &= (M_1 + M_2) + M_3, \\
M_1 + M_2 + M_3 + M_4 &= (M_1 + M_2 + M_3) + M_4,
\end{aligned}
$$

and so on. By the ordered sum of the types $\theta_1$ and $\theta_2$, denoted by $\theta_1 + \theta_2$,
we mean the order type of the set $M_1 + M_2$.


Example. Consider the order types $\omega$ and n. It is easy to see that
$n + \omega = \omega$. In fact, if finitely many terms are written to the left of the
sequence 1, 2, ..., k, ..., we again get a set of the same type (why?).
On the other hand, the order type $\omega + n$, i.e., the order type of the set⁸


$$\{1, 2, \ldots, k, \ldots, a_1, a_2, \ldots, a_n\},$$

is obviously not equal to $\omega$.


Again let $M_1$ and $M_2$ be two ordered sets of types $\theta_1$ and $\theta_2$, respectively.
Suppose we replace each element of $M_2$ by a “replica” of the set $M_1$. Then
the resulting set, denoted by $M_1 \cdot M_2$, is called the ordered product of $M_1$
and $M_2$. More exactly, $M_1 \cdot M_2$ is the set of all pairs (a, b) where $a \in M_1$,
$b \in M_2$, ordered in such a way that


1) $(a_1, b_1) < (a_2, b_2)$ if $b_1 < b_2$ (for arbitrary $a_1, a_2$);
2) $(a_1, b) < (a_2, b)$ if $a_1 < a_2$.


Note that the order of factors matters here, i.e., in general $M_2 \cdot M_1$ is not
isomorphic to $M_1 \cdot M_2$. The ordered product of any finite number of ordered
sets can be defined by writing (cf. Problem 6)


$$
\begin{aligned}
M_1 \cdot M_2 \cdot M_3 &= (M_1 \cdot M_2) \cdot M_3, \\
M_1 \cdot M_2 \cdot M_3 \cdot M_4 &= (M_1 \cdot M_2 \cdot M_3) \cdot M_4,
\end{aligned}
$$

and so on. By the ordered product of the types $\theta_1$ and $\theta_2$, denoted by $\theta_1 \cdot \theta_2$,
we mean the order type of the set $M_1 \cdot M_2$.
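For finite ordered sets, the two orderings just defined can be written out directly. The sketch below is only an illustration: the function names are hypothetical, the elements of each set are assumed to be comparable by Python's built-in ordering, and tags 0 and 1 are attached in the sum so that the two sets are kept disjoint.

```python
def ordered_sum(M1, M2):
    """List the elements of M1 + M2 in increasing order.

    Every element of M1 precedes every element of M2.
    """
    return [(0, a) for a in sorted(M1)] + [(1, b) for b in sorted(M2)]


def ordered_product(M1, M2):
    """List the pairs (a, b) of M1 . M2 in increasing order.

    Pairs are compared by the second element first, then by the first,
    as in rules 1) and 2) of the definition.
    """
    return sorted(((a, b) for a in M1 for b in M2),
                  key=lambda p: (p[1], p[0]))


s = ordered_sum([1, 2], [1, 2, 3])
p = ordered_product([1, 2], ["x", "y", "z"])
```

In `p` each element of the second set is replaced by a copy of the first set: the pairs with second element "x" come first, then those with "y", then those with "z". Finite examples cannot show that the order of terms or factors matters, since finite ordered sets with the same number of elements are always isomorphic.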
3.5. Well-ordered sets. Ordinal numbers. A key concept in the theory of 
ordered sets is given by 


DEFINITION 1. An ordered set M is said to be well-ordered if every
nonempty subset A of M has a smallest (or “first”) element, i.e., an element
$\mu$ such that $\mu \leq a$ for every $a \in A$.


Example 1. Every finite ordered set is obviously well-ordered. 


Example 2. Every nonempty subset of a well-ordered set is itself well- 
ordered. 


Example 3. The set M of rational numbers in the interval [0, 1] is ordered
but not well-ordered. It is true that M has a smallest element, namely the
number 0, but the subset of M consisting of all positive rational numbers
has no smallest element.


8 Here we use the same curly bracket notation as in Sec. 1.1, but the order of terms
is now crucial.


DEFINITION 2. The order type of a well-ordered set is called an ordinal
number or simply an ordinal.⁹ If the set is infinite, the ordinal is said to be
transfinite.


Example 4. The set of positive integers 1, 2, ..., k, ... arranged in
increasing order is well-ordered, and hence its order type $\omega$ is a (transfinite)
ordinal. The order type $\omega + n$ of the set


$$\{1, 2, \ldots, k, \ldots, a_1, a_2, \ldots, a_n\}$$

is also an ordinal.


Example 5. The set

$$\{\ldots, -k, \ldots, -3, -2, -1\} \tag{1}$$


is ordered but not well-ordered. It is true that any nonempty subset A of
(1) has a largest element (i.e., an element $\nu$ such that $a \leq \nu$ for every $a \in A$),
but in general A will not have a smallest element. In fact, the set (1) itself
has no smallest element. Hence the order type of (1), denoted by $\omega^*$, is not
an ordinal number.


THEOREM 1. The ordered sum of a finite number of well-ordered sets
$M_1, M_2, \ldots, M_n$ is itself a well-ordered set.


Proof. Let M be an arbitrary nonempty subset of the ordered sum
$M_1 + M_2 + \cdots + M_n$, and let $M_k$ be the first of the sets $M_1, M_2, \ldots, M_n$ (namely
the set with smallest index) containing elements of M. Then $M \cap M_k$
is a subset of the well-ordered set $M_k$, and as such has a smallest element
$\mu$. Clearly $\mu$ is the smallest element of M itself. ∎


COROLLARY. The ordered sum of a finite number of ordinal numbers is
itself an ordinal number.


Thus new ordinal numbers can be constructed from any given set of 
ordinal numbers. For example, starting from the positive integers (i.e., the 
finite ordinal numbers) and the ordinal number $\omega$, we can construct the new
ordinal numbers


$$\omega + n, \quad \omega + \omega, \quad \omega + \omega + n, \quad \omega + \omega + \omega,$$

and so on.


THEOREM 2. The ordered product of two well-ordered sets $M_1$ and $M_2$
is itself a well-ordered set.


9 This is a good place to point out that the terms “cardinal number” and “power”
(of a set) are synonymous.


Proof. Let M be an arbitrary nonempty subset of $M_1 \cdot M_2$, so that M is a set of
ordered pairs (a, b) with $a \in M_1$, $b \in M_2$. The set of all second elements b
of pairs in M is a subset of $M_2$, and as such has a smallest element since
$M_2$ is well-ordered. Let $b_1$ denote this smallest element, and consider
all pairs of the form $(a, b_1)$ contained in M. The set of all first elements
a of these pairs is a subset of $M_1$, and as such has a smallest element
since $M_1$ is well-ordered. Let $a_1$ denote this smallest element. Then the
pair $(a_1, b_1)$ is clearly the smallest element of M. ∎
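The two-step search in this proof translates directly into a short routine for finite sets of pairs. This is a sketch only: the function name is hypothetical, and the coordinates are assumed to be comparable by Python's built-in ordering.

```python
def smallest_in_product(S):
    """Smallest pair of a nonempty finite set S of pairs (a, b).

    Follows the proof: first take the least second element b1,
    then the least first element among pairs of the form (a, b1).
    """
    b1 = min(b for _, b in S)
    a1 = min(a for a, b in S if b == b1)
    return (a1, b1)


S = {(5, 2), (1, 3), (4, 2), (9, 7)}
least = smallest_in_product(S)   # (4, 2)
```

The result agrees with taking the minimum of S under the key (b, a), which is the ordering of the ordered product.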


COROLLARY 1. The ordered product of a finite number of well-ordered
sets is itself a well-ordered set.


COROLLARY 2. The ordered product of a finite number of ordinal numbers
is itself an ordinal number.


Thus it makes sense to talk about the ordinal numbers

$$\omega \cdot n, \quad \omega^2, \quad \omega^2 \cdot n, \quad \omega^3,$$

and so on. It is also possible to define such ordinal numbers as¹⁰


$$\omega^{\omega}, \quad \omega^{\omega^{\omega}}, \quad \ldots$$


3.6. Comparison of ordinal numbers. If $n_1$ and $n_2$ are two finite ordinal
numbers, then they either coincide or else one is larger than the other. As
we now show, the same is true of transfinite ordinal numbers. We begin by
observing that every element a of a well-ordered set M determines an (initial)
section P, the set of all $x \in M$ such that x < a, and a remainder Q, the set
of all $x \in M$ such that $x \geq a$. Given any two ordinal numbers $\alpha$ and $\beta$, let
M and N be well-ordered sets of order type $\alpha$ and $\beta$, respectively. Then we
say that


1) $\alpha = \beta$ if M and N are isomorphic;
2) $\alpha < \beta$ if M is isomorphic to some section of N;
3) $\alpha > \beta$ if N is isomorphic to some section of M


(note that this definition makes sense for finite $\alpha$ and $\beta$).


LEMMA. Let f be an isomorphism of a well-ordered set A onto some
subset $B \subset A$. Then $f(a) \geq a$ for all $a \in A$.


Proof. If there are elements $a \in A$ such that f(a) < a, then there is a
least such element since A is well-ordered. Let $a_0$ be this element, and
let $b_0 = f(a_0)$. Then $b_0 < a_0$, and hence $f(b_0) < f(a_0) = b_0$ since f is an
isomorphism. But then $a_0$ is not the smallest element such that f(a) < a.
Contradiction! ∎


10 See e.g., A. A. Fraenkel, Abstract Set Theory, third edition, North-Holland Pub- 
lishing Co., Amsterdam (1966), pp. 202-208. 
It follows from the lemma that a well-ordered set A cannot be isomorphic
to any of its sections, since if A were isomorphic to the section
determined by a, then clearly f(a) < a. In other words, the two relations


$$\alpha = \beta, \qquad \alpha < \beta$$

are incompatible, and so are

$$\alpha = \beta, \qquad \alpha > \beta.$$

Moreover, the two relations

$$\alpha < \beta, \qquad \alpha > \beta$$


are incompatible, since otherwise we could use the transitivity to deduce
$\alpha < \alpha$, which is impossible by the lemma. Therefore, if one of the three
relations

$$\alpha < \beta, \qquad \alpha = \beta, \qquad \alpha > \beta \tag{2}$$


holds, the other two are automatically excluded. We must still show that
one of the relations (2) always holds, thereby proving that any two ordinal
numbers are comparable.


THEOREM 3. Two given ordinal numbers $\alpha$ and $\beta$ satisfy one and only
one of the relations

$$\alpha < \beta, \qquad \alpha = \beta, \qquad \alpha > \beta.$$


Proof. Let $W(\alpha)$ be the set of all ordinals $< \alpha$. Any two numbers
$\gamma$ and $\gamma'$ in $W(\alpha)$ are comparable¹¹ and the corresponding ordering of
$W(\alpha)$ makes it a well-ordered set of type $\alpha$. In fact, if a set


$$A = \{\ldots, a, \ldots, b, \ldots\}$$


is of type $\alpha$, then by definition, the ordinals less than $\alpha$ are the types of
well-ordered sets isomorphic to sections of A. Hence the ordinals themselves
are in one-to-one correspondence with the elements of A. In other
words, the elements of a set of type $\alpha$ can be numbered by using the
ordinals less than $\alpha$:


$$A = \{a_1, a_2, \ldots, a_\lambda, \ldots\}.$$


Now let $\alpha$ and $\beta$ be any two ordinals. Then $A = W(\alpha)$ and $B = W(\beta)$ are
well-ordered sets of types $\alpha$ and $\beta$, respectively. Moreover, let $C = A \cap B$
be the intersection of the sets A and B, i.e., the set of all ordinals less than
both $\alpha$ and $\beta$. Then C is well-ordered, of type $\gamma$, say. We now show that
$\gamma \leq \alpha$. If C = A, then obviously $\gamma = \alpha$. On the other hand, if $C \neq A$,
then C is a section of A and hence $\gamma < \alpha$. In fact, let $\xi \in C$, $\eta \in A - C$.
Then $\xi$ and $\eta$ are comparable, i.e., either $\xi < \eta$ or $\xi > \eta$. But $\eta < \xi < \alpha$
is impossible, since then $\eta \in C$. Therefore $\xi < \eta$ and hence C is a
section of A, which implies $\gamma < \alpha$. Moreover, $\gamma$ is the first element of
the set A - C. Thus $\gamma \leq \alpha$, as asserted, and similarly $\gamma \leq \beta$. The case
$\gamma < \alpha$, $\gamma < \beta$ is impossible, since then $\gamma \in A - C$, $\gamma \in B - C$. But
then $\gamma \notin C$ on the one hand and $\gamma \in A \cap B = C$ on the other hand.
It follows that there are only three possibilities


$$
\begin{aligned}
\gamma &= \alpha, & \gamma &= \beta, & \alpha &= \beta, \\
\gamma &= \alpha, & \gamma &< \beta, & \alpha &< \beta, \\
\gamma &< \alpha, & \gamma &= \beta, & \alpha &> \beta,
\end{aligned}
$$


i.e., $\alpha$ and $\beta$ are comparable. ∎


11 Recall the meaning of $\gamma < \alpha$, $\gamma' < \alpha$, and use the fact that a section of a section of
a well-ordered set is itself a section of a well-ordered set.


THEOREM 4. Let A and B be well-ordered sets. Then either A is equivalent
to B or one of the sets is of greater power than the other, i.e., the powers
of A and B are comparable.


Proof. There is a definite power corresponding to each ordinal. But
we have just seen that ordinals are comparable, and so are the corresponding
powers (recall the definition of inequality of powers given in
Sec. 2.5). ∎


3.7. The well-ordering theorem, the axiom of choice and equivalent asser- 
tions. Theorem 4 shows that the powers of two well-ordered sets are always 
comparable. In 1904, Zermelo succeeded in proving the 


WELL-ORDERING THEOREM. Every set can be well-ordered. 


It follows from the well-ordering theorem and Theorem 4 that the powers of
two arbitrary sets are always comparable, a fact already used in Sec. 2.5.
Zermelo’s proof, which will not be given here,¹² rests on the following basic


AXIOM OF CHOICE. Given any set M, there is a “choice function” f such
that f(A) is an element of A for every nonempty subset $A \subset M$.


We will assume the validity of the axiom of choice without further ado. 
In fact, without the axiom of choice we would be severely hampered in 
making set-theoretic constructions. However, it should be noted that from 
the standpoint of the foundations of set theory, there are still deep and 
controversial problems associated with the use of the axiom of choice. 

12 A. A. Fraenkel, op. cit., pp. 222-227.


There are a number of assertions equivalent to the axiom of choice, i.e.,
assertions each of which both implies and is implied by the axiom of choice.
One of these is the well-ordering theorem, which obviously implies the axiom
of choice. In fact, if an arbitrary set M can be well-ordered, then, by merely
choosing the “first” element in each nonempty subset $A \subset M$, we get the function f(A)
figuring in the statement of the axiom of choice. On the other hand, the
axiom of choice implies the well-ordering theorem, as already noted without
proof.
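The passage from a well-ordering to a choice function is easy to illustrate on a finite set. In the sketch below the names `choice_from_well_ordering` and `rank` are assumptions: `rank` stands for a fixed well-ordering of M, given here by an arbitrary numbering of three elements.

```python
def choice_from_well_ordering(rank):
    """Return a choice function f built from a well-ordering.

    rank(x) is the position of x in the (hypothetical) well-ordering;
    f(A) is the first element of the nonempty subset A.
    """
    def f(A):
        if not A:
            raise ValueError("f is defined only for nonempty subsets")
        return min(A, key=rank)
    return f


# A hypothetical well-ordering of M = {a, b, c}: c first, then a, then b.
rank = {"c": 0, "a": 1, "b": 2}.__getitem__
f = choice_from_well_ordering(rank)
```

For instance, f picks "a" from the subset {"a", "b"} and "c" from M itself. In every case f(A) is an element of A, as the axiom of choice requires.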

To state further assertions equivalent to the axiom of choice, we need 
some more terminology: 


DEFINITION 3. Let M be a partially ordered set, and let A be any subset
of M such that a and b are comparable for every $a, b \in A$. Then A is called
a chain (in M). A chain C is said to be maximal if there is no other chain C'
in M containing C as a proper subset.


DEFINITION 4. An element a of a partially ordered set M is called an
upper bound of a subset $M' \subset M$ if $a' \leq a$ for every $a' \in M'$.


We now have the vocabulary needed to state two other assertions equiv- 
alent to the axiom of choice: 


HAUSDORFF’S MAXIMAL PRINCIPLE. Every chain in a partially ordered 
set M is contained in a maximal chain in M. 


ZORN’S LEMMA. If every chain in a partially ordered set M has an upper 
bound, then M contains a maximal element. 


For the proof of the equivalence of the axiom of choice, the well-ordering 
theorem, Hausdorff’s maximal principle and Zorn’s lemma, we refer the 
reader elsewhere.¹³ Of these various equivalent assertions, Zorn’s lemma is
perhaps the most useful. 
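In a finite partially ordered set no axiom of choice is needed, and a chain can be extended to a maximal chain by a single greedy pass. The sketch below is an illustration only: the function name is hypothetical, and the partially ordered set is the divisibility ordering on the integers from 2 to 12.

```python
def extend_to_maximal_chain(elems, leq, chain):
    """Extend a chain to a maximal chain in a finite partially ordered set.

    Each element is added if it is comparable with everything already
    in the chain. An element rejected once stays noncomparable with
    some member of the chain, so one pass is enough.
    """
    chain = list(chain)
    for x in elems:
        if x not in chain and all(leq(x, c) or leq(c, x) for c in chain):
            chain.append(x)
    return chain


def divides(a, b):
    """a <= b means b is divisible by a."""
    return b % a == 0


M = list(range(2, 13))
C = extend_to_maximal_chain(M, divides, [2])   # [2, 4, 8]
```

The chain 2, 4, 8 cannot be enlarged within M, and its upper bound 8 is a maximal element of M, in line with the conclusion of Zorn's lemma for this finite case.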


3.8. Transfinite induction. Mathematical propositions are very often 
proved by using the following familiar 


THEOREM 4 (Mathematical induction). Given a proposition P(n) formulated
for every positive integer n, suppose that


1) P(1) is true;

2) The validity of P(k) for all $k \leq n$ implies the validity of P(n + 1).


Then P(n) is true for all n = 1, 2, ...

Proof. Suppose P(n) fails to be true for all n = 1, 2, ..., and let
$n_1$ be the smallest integer for which P(n) is false (the existence of $n_1$
follows from the well-ordering of the positive integers). Clearly $n_1 > 1$,
so that $n_1 - 1$ is a positive integer. Therefore P(k) is valid for all
$k \leq n_1 - 1$ but not for $n_1$. Contradiction! ∎


13 See e.g., G. Birkhoff, Lattice Theory, third edition, American Mathematical Society,
Providence, R.I. (1967), pp. 205-206.


Replacing the set of all positive integers by an arbitrary well-ordered set,
we get


THEOREM 4' (Transfinite induction). Given a well-ordered set A, let
P(a) be a proposition formulated for every element $a \in A$. Suppose that


1) P(a) is true for the smallest element of A;
2) The validity of P(a) for all a < a* implies the validity of P(a*).


Then P(a) is true for all $a \in A$.


Proof. Suppose P(a) fails to be true for all $a \in A$. Then P(a) is false
for all a in some nonempty subset $A^* \subset A$. By the well-ordering, A*
has a smallest element a*. Therefore P(a) is valid for all a < a* but
not for a*. Contradiction! ∎


Remark. Since any set can be well-ordered, by the well-ordering theorem, 
transfinite induction can in principle be applied to any set M whatsoever. 
In practice, however, Zorn’s lemma is a more useful tool, requiring only that 
M be partially ordered. 


3.9. Historical remarks. Set theory as a branch of mathematics in its 
own right stems from the pioneer work of Georg Cantor (1845-1918). 
Originally met with disbelief, Cantor’s ideas subsequently became widespread. 
By now, the set-theoretic point of view has become standard in the most 
diverse fields of mathematics. Basic concepts, like groups, rings, fields, linear 
spaces, etc. are habitually defined as sets of elements of an arbitrary kind 
obeying appropriate axioms. 

Further development of set theory led to a number of logical difficulties,
which naturally gave rise to attempts to replace “naive” set theory by a more
rigorous, axiomatic set theory. It turns out that certain set-theoretic questions,
which would at first seem to have “yes” or “no” answers, are in fact of a
different kind. Thus it was shown by Gödel in 1940 that a negative answer
to the question “Is there an uncountable set of power less than that of the
continuum?” is consistent with set theory (axiomatized in a way we will not
discuss here), but it was recently shown by Cohen that an affirmative answer
to the question is also consistent in the same sense!


Problem 1. Exhibit both a partial ordering and a simple ordering of the 
set of all complex numbers. 


Problem 2. What is the minimal element of the set of all subsets of a
given set X, partially ordered by set inclusion? What is the maximal element?


Problem 3. A partially ordered set M is said to be a directed set if, given
any two elements $a, b \in M$, there is an element $c \in M$ such that $a \leq c$, $b \leq c$.
Are the partially ordered sets in Examples 1-4, Sec. 3.1 all directed sets?


14 For example, the set of all transfinite ordinals less than a given ordinal.


Problem 4. By the greatest lower bound of two elements a and b of a
partially ordered set M, we mean an element $c \in M$ such that $c \leq a$, $c \leq b$
and there is no element $d \in M$ such that $c < d \leq a$, $d \leq b$. Similarly, by
the least upper bound of a and b, we mean an element $c \in M$ such that $a \leq c$,
$b \leq c$ and there is no element $d \in M$ such that $a \leq d < c$, $b \leq d$. By a
lattice is meant a partially ordered set any two elements of which have both
a greatest lower bound and a least upper bound. Prove that the set of all
subsets of a given set X, partially ordered by set inclusion, is a lattice. What
is the set-theoretic meaning of the greatest lower bound and least upper
bound of two elements of this set?


Problem 5. Prove that an order-preserving mapping of one ordered set 
onto another is automatically an isomorphism. 


Problem 6. Prove that ordered sums and products of ordered sets are
associative, i.e., prove that if $M_1$, $M_2$ and $M_3$ are ordered sets, then


$$(M_1 + M_2) + M_3 = M_1 + (M_2 + M_3), \qquad (M_1 \cdot M_2) \cdot M_3 = M_1 \cdot (M_2 \cdot M_3),$$

where the operations + and $\cdot$ are the same as in Sec. 3.4.


Comment. This allows us to drop parentheses in writing ordered sums 
and products. 


Problem 7. Construct well-ordered sets with ordinals

$$\omega + n, \quad \omega + \omega, \quad \omega + \omega + n, \quad \omega + \omega + \omega, \quad \ldots$$

Show that the sets are all countable.

Problem 8. Construct well-ordered sets with ordinals

$$\omega \cdot n, \quad \omega^2, \quad \omega^2 \cdot n, \quad \omega^3, \quad \ldots$$

Show that the sets are all countable.

Problem 9. Show that

$$\omega + \omega = \omega \cdot 2, \quad \omega + \omega + \omega = \omega \cdot 3, \quad \ldots$$


Problem 10. Prove that the set $W(\alpha)$ of all ordinals less than a given
ordinal $\alpha$ is well-ordered.


Problem 11. Prove that any nonempty set of ordinals is well-ordered. 


Problem 12. Prove that the set M of all ordinals corresponding to a 
countable set is itself uncountable. 


Problem 13. Let $\aleph_1$ be the power of the set M in the preceding problem.
Prove that there is no power m such that $\aleph_0 < m < \aleph_1$.


4. Systems of Sets¹⁵


4.1. Rings of sets. By a system of sets we mean any set whose elements 
are themselves sets. Unless the contrary is explicitly stated, the elements 
of a given system of sets will be assumed to be certain subsets of some fixed 
set X. Systems of sets will usually be denoted by capital script letters like
$\mathscr{R}$, $\mathscr{S}$, etc. Our chief interest will be systems of sets which have certain
closure properties under the operations introduced in Sec. 1.1. 


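DEFINITION 1. A nonempty system of sets $\mathscr{R}$ is called a ring (of sets) if
$A \,\triangle\, B \in \mathscr{R}$ and $A \cap B \in \mathscr{R}$ whenever $A \in \mathscr{R}$, $B \in \mathscr{R}$.
Since

$$A \cup B = (A \,\triangle\, B) \,\triangle\, (A \cap B),$$
$$A - B = A \,\triangle\, (A \cap B),$$

we also have $A \cup B \in \mathscr{R}$ and $A - B \in \mathscr{R}$ whenever $A \in \mathscr{R}$, $B \in \mathscr{R}$.
Thus a ring of sets is a system of sets closed under the operations of
taking unions, intersections, differences, and symmetric differences.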
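Definition 1 and the two identities above can be checked by brute force on a small example. The following is only a sketch for finite systems: the helper names `is_ring` and `powerset` and the sample systems are our own illustrative choices, not notation from the text.

```python
from itertools import combinations

def is_ring(system):
    """True if the nonempty finite system of frozensets is closed under
    symmetric difference and intersection (Definition 1)."""
    system = set(system)
    if not system:
        return False
    return all(
        (a ^ b) in system and (a & b) in system
        for a in system
        for b in system
    )

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return {frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)}

# Hypothetical example systems over X = {1, 2, 3}.
ring = powerset({1, 2, 3})
not_ring = {frozenset(), frozenset({1}), frozenset({2})}  # {1} ^ {2} = {1, 2} is missing

# The two identities used in the text, checked on every pair of subsets.
identities_hold = all(
    (a | b) == (a ^ b) ^ (a & b) and (a - b) == a ^ (a & b)
    for a in ring
    for b in ring
)
```

Here `a ^ b` is Python's symmetric difference of sets, so the closure test mirrors the definition term by term.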


Clearly, a ring of sets is also closed under the operations of taking finite
unions and intersections:

$$\bigcup_{k=1}^{n} A_k, \qquad \bigcap_{k=1}^{n} A_k.$$


A ring of sets must contain the empty set $\emptyset$, since $A - A = \emptyset$.
A set E is called the unit of a system of sets $\mathscr{S}$ if $E \in \mathscr{S}$ and

$$A \cap E = A$$

for every $A \in \mathscr{S}$. Clearly E is unique (why?). Thus the unit of $\mathscr{S}$ is
just the maximal set of $\mathscr{S}$, i.e., the set containing all other sets of $\mathscr{S}$.
A ring of sets with a unit is called an algebra (of sets).


Example 1. Given a set A, the system $\mathscr{M}(A)$ of all subsets of A is an
algebra of sets, with unit E = A.


Example 2. The system $\{\emptyset, A\}$ consisting of the empty set $\emptyset$ and any
nonempty set A is an algebra of sets, with E = A.


Example 3. The system of all finite subsets of a given set A is a ring of 
sets. This ring is an algebra if and only if A itself is finite. 


Example 4. The system of all bounded subsets of the real line is a ring of 
sets, which does not contain a unit. 


15 The material in this section need not be read now, since it will not be needed until
Chapter 7.


THEOREM 1. The intersection

$$\mathscr{R} = \bigcap_{\alpha} \mathscr{R}_\alpha$$

of any set of rings is itself a ring.

Proof. An immediate consequence of Definition 1. ∎


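THEOREM 2. Given any nonempty system of sets $\mathscr{S}$, there is a unique
ring $\mathscr{P}$ containing $\mathscr{S}$ and contained in every ring containing $\mathscr{S}$.


Proof. If $\mathscr{P}$ exists, then clearly $\mathscr{P}$ is unique (why?). To prove the
existence of $\mathscr{P}$, consider the union

$$X = \bigcup_{A \in \mathscr{S}} A$$

of all sets A belonging to $\mathscr{S}$ and the ring $\mathscr{M}(X)$ of all subsets of X. Let
$\Sigma$ be the set of all rings of sets contained in $\mathscr{M}(X)$ and containing $\mathscr{S}$.
Then the intersection

$$\mathscr{P} = \bigcap_{\mathscr{R} \in \Sigma} \mathscr{R}$$

of all these rings clearly has the desired properties. In fact, $\mathscr{P}$ obviously
contains $\mathscr{S}$. Moreover, if $\mathscr{R}^*$ is any ring containing $\mathscr{S}$, then the
intersection $\mathscr{R} = \mathscr{R}^* \cap \mathscr{M}(X)$ is a ring in $\Sigma$ and hence $\mathscr{P} \subset \mathscr{R} \subset \mathscr{R}^*$,
as required. The ring $\mathscr{P}$ is called the minimal ring generated by the system
$\mathscr{S}$, and will henceforth be denoted by $\mathscr{R}(\mathscr{S})$. ∎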
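For a finite system of finite sets, the minimal ring can also be reached from below, by repeatedly adding symmetric differences and intersections until nothing new appears. This is a different route from the intersection argument in the proof, and the sketch below (with the hypothetical helper `generated_ring`) is meant only as an illustration of what the minimal ring looks like in a small case.

```python
def generated_ring(system):
    """Smallest family containing `system` that is closed under symmetric
    difference and intersection (finite systems of finite sets only)."""
    ring = {frozenset(s) for s in system}
    if not ring:
        raise ValueError("the generating system must be nonempty")
    while True:
        # One closure step: all pairwise symmetric differences and intersections.
        new = {a ^ b for a in ring for b in ring} | {a & b for a in ring for b in ring}
        if new <= ring:          # nothing new: the family is already a ring
            return ring
        ring |= new

# Hypothetical generating system: two overlapping sets.
S = [{1, 2}, {2, 3}]
R = generated_ring(S)
```

Every ring containing the generating system must contain each set produced in a closure step, so the family returned is contained in every such ring. In this example the loop produces the singletons {1}, {2}, {3} and hence all subsets of {1, 2, 3}.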


Remark. The set $\mathscr{M}(X)$ containing $\mathscr{R}(\mathscr{S})$ has been introduced to avoid
talking about the “set of all rings containing $\mathscr{S}$.” Such concepts as “the
set of all sets,” “the set of all rings,” etc. are inherently contradictory and
should be avoided (recall Problem 10, p. 20).


4.2. Semirings of sets. The following notion is more general than that 
of a ring of sets and plays an important role in a number of problems (par- 
ticularly in measure theory): 


DEFINITION 2. A system of sets $\mathscr{S}$ is called a semiring (of sets) if


1) $\mathscr{S}$ contains the empty set $\emptyset$;

2) $A \cap B \in \mathscr{S}$ whenever $A \in \mathscr{S}$, $B \in \mathscr{S}$;

3) If $\mathscr{S}$ contains the sets A and $A_1 \subset A$, then A can be represented
as a finite union

$$A = \bigcup_{k=1}^{n} A_k \tag{1}$$

of pairwise disjoint sets of $\mathscr{S}$, with the given set $A_1$ as its first term.




Remark. The representation (1) is called a finite expansion of A, with
respect to the sets $A_1, A_2, \ldots, A_n$.


Example 1. Every ring of sets $\mathscr{R}$ is a semiring, since if $\mathscr{R}$ contains A and
$A_1 \subset A$, then $A = A_1 \cup A_2$ where $A_2 = A - A_1 \in \mathscr{R}$.

Example 2. The set $\mathscr{S}$ of all open intervals (a, b), closed intervals [a, b]
and half-open intervals [a, b), (a, b], including the “empty interval” (a, a) =
$\emptyset$ and the single-element sets [a, a] = {a}, is a semiring but not a ring.


LEMMA 1. Suppose the sets $A, A_1, \ldots, A_n$, where $A_1, \ldots, A_n$ are
pairwise disjoint subsets of A, all belong to a semiring $\mathscr{S}$. Then there is a
finite expansion

$$A = \bigcup_{k=1}^{s} A_k \qquad (s \ge n)$$

with $A_1, \ldots, A_n$ as its first n terms, where $A_k \in \mathscr{S}$, $A_k \cap A_l = \emptyset$ for all
$k, l = 1, \ldots, n$ ($k \neq l$).


Proof. The lemma holds for n = 1, by the definition of a semiring.
Suppose the lemma holds for n = m, and consider m + 1 sets $A_1, \ldots,
A_m, A_{m+1}$ satisfying the conditions of the lemma. By hypothesis,

$$A = A_1 \cup \cdots \cup A_m \cup B_1 \cup \cdots \cup B_p,$$

where the sets $A_1, \ldots, A_m, B_1, \ldots, B_p$ are pairwise disjoint subsets of
A, all belonging to $\mathscr{S}$. Let

$$B_{q1} = A_{m+1} \cap B_q.$$

By the definition of a semiring,

$$B_q = B_{q1} \cup \cdots \cup B_{q r_q},$$

where the sets $B_{qj}$ ($j = 1, \ldots, r_q$) are pairwise disjoint subsets of $B_q$,
all belonging to $\mathscr{S}$. But then it is easy to see that

$$A = A_1 \cup \cdots \cup A_m \cup A_{m+1} \cup \bigcup_{q=1}^{p} \Bigl( \bigcup_{j=2}^{r_q} B_{qj} \Bigr),$$

i.e., the lemma is true for n = m + 1. The proof now follows by
mathematical induction. ∎


LEMMA 2. Given any finite system of sets $A_1, \ldots, A_n$ belonging to a
semiring $\mathscr{S}$, there is a finite system of pairwise disjoint sets $B_1, \ldots, B_t$
belonging to $\mathscr{S}$ such that every $A_k$ has a finite expansion

$$A_k = \bigcup_{s \in M_k} B_s \qquad (k = 1, \ldots, n)$$

with respect to certain of the sets $B_s$.¹⁶


16 Here $M_k$ denotes some subset of the set $\{1, 2, \ldots, t\}$, depending on the choice of k.


Proof. The lemma is trivial for n = 1, since we need only set t = 1,
$B_1 = A_1$. Suppose the lemma is true for n = m, and consider a system
of sets $A_1, \ldots, A_m, A_{m+1}$ in $\mathscr{S}$. Let $B_1, \ldots, B_t$ be sets of $\mathscr{S}$ satisfying
the conditions of the lemma with respect to $A_1, \ldots, A_m$, and let

$$B_{s1} = A_{m+1} \cap B_s.$$

Then, by Lemma 1, there is an expansion

$$A_{m+1} = \Bigl( \bigcup_{s=1}^{t} B_{s1} \Bigr) \cup \Bigl( \bigcup_{p=1}^{q} B'_p \Bigr) \qquad (B'_p \in \mathscr{S}),$$

while, by the very definition of a semiring, there is an expansion

$$B_s = B_{s1} \cup B_{s2} \cup \cdots \cup B_{s r_s} \qquad (B_{sj} \in \mathscr{S}).$$

It is easy to see that

$$A_k = \bigcup_{s \in M_k} \Bigl( \bigcup_{j=1}^{r_s} B_{sj} \Bigr) \qquad (k = 1, \ldots, m)$$

for some suitable $M_k$. Moreover, the sets $B_{sj}$, $B'_p$ are pairwise disjoint.
Hence the sets $B_{sj}$, $B'_p$ satisfy the conditions of the lemma with respect
to $A_1, \ldots, A_m, A_{m+1}$. The proof now follows by mathematical
induction. ∎


4.3. The ring generated by a semiring. According to Theorem 2, there is
a unique minimal ring $\mathscr{R}(\mathscr{S})$ generated by a given system of sets $\mathscr{S}$. The
actual construction of $\mathscr{R}(\mathscr{S})$ is quite complicated for arbitrary $\mathscr{S}$. However,
the construction is completely straightforward if $\mathscr{S}$ is a semiring, as shown
by

THEOREM 3. If $\mathscr{S}$ is a semiring, then $\mathscr{R}(\mathscr{S})$ coincides with the system
$\mathscr{Z}$ of all sets A which have finite expansions

$$A = \bigcup_{k=1}^{n} A_k$$

with respect to the sets $A_k \in \mathscr{S}$.
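Proof. First we prove that $\mathscr{Z}$ is a ring. Let A and B be any two sets in
$\mathscr{Z}$. Then there are expansions

$$A = \bigcup_{i=1}^{m} A_i \quad (A_i \in \mathscr{S}), \qquad B = \bigcup_{j=1}^{n} B_j \quad (B_j \in \mathscr{S}).$$

Since $\mathscr{S}$ is a semiring, the sets

$$C_{ij} = A_i \cap B_j$$

also belong to $\mathscr{S}$. By Lemma 1, there are expansions

$$A_i = \Bigl( \bigcup_{j=1}^{n} C_{ij} \Bigr) \cup \Bigl( \bigcup_{k=1}^{r_i} D_{ik} \Bigr) \quad (D_{ik} \in \mathscr{S}),$$
$$B_j = \Bigl( \bigcup_{i=1}^{m} C_{ij} \Bigr) \cup \Bigl( \bigcup_{l=1}^{s_j} E_{jl} \Bigr) \quad (E_{jl} \in \mathscr{S}). \tag{2}$$

It follows from (2) that $A \cap B$ and $A \,\triangle\, B$ have the expansions

$$A \cap B = \bigcup_{i,j} C_{ij},$$
$$A \,\triangle\, B = \Bigl( \bigcup_{i,k} D_{ik} \Bigr) \cup \Bigl( \bigcup_{j,l} E_{jl} \Bigr),$$

and hence belong to $\mathscr{Z}$. Therefore $\mathscr{Z}$ is a ring. The fact that $\mathscr{Z}$ is the
minimal ring generated by $\mathscr{S}$ is obvious. ∎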


4.4. Borel algebras. There are many problems (particularly in measure
theory) involving unions and intersections not only of a finite number of
sets, but also of a countable number of sets. This motivates the following
concepts:


DEFINITION 3. A ring of sets is called a σ-ring if it contains the union

$$S = \bigcup_{n=1}^{\infty} A_n$$

whenever it contains the sets $A_1, A_2, \ldots, A_n, \ldots$. A σ-ring with a unit
E is called a σ-algebra.


DEFINITION 4. A ring of sets is called a δ-ring if it contains the
intersection

$$D = \bigcap_{n=1}^{\infty} A_n$$

whenever it contains the sets $A_1, A_2, \ldots, A_n, \ldots$. A δ-ring with a unit
E is called a δ-algebra.

THEOREM 4. Every σ-algebra is a δ-algebra and conversely.

Proof. An immediate consequence of the “dual” formulas

$$\bigcup_n A_n = E - \bigcap_n (E - A_n),$$
$$\bigcap_n A_n = E - \bigcup_n (E - A_n). \qquad ∎$$
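The dual formulas are easy to test on a concrete family. The sketch below uses a finite family inside a finite unit E, both chosen arbitrarily for illustration; the theorem itself concerns countable families, which a finite check cannot cover.

```python
from functools import reduce

E = frozenset(range(10))                       # the unit (hypothetical example)
family = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({2, 7, 9})]

# Union and intersection computed directly.
union = reduce(frozenset.union, family)
intersection = reduce(frozenset.intersection, family)

# The same sets computed through complements with respect to E.
union_via_dual = E - reduce(frozenset.intersection, [E - a for a in family])
intersection_via_dual = E - reduce(frozenset.union, [E - a for a in family])
```

Both routes give the same sets, which is exactly why closure under countable unions and closure under countable intersections are interchangeable once a unit is available.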


The term Borel algebra (or briefly, B-algebra) is often used to denote
a σ-algebra (equivalently, a δ-algebra). The simplest example of a B-algebra
is the set of all subsets of a given set A.

Given any system of sets $\mathscr{S}$, there always exists at least one B-algebra
containing $\mathscr{S}$. In fact, let

$$X = \bigcup_{A \in \mathscr{S}} A.$$

Then the system $\mathscr{B}$ of all subsets of X is clearly a B-algebra containing $\mathscr{S}$.

If $\mathscr{B}$ is any B-algebra containing $\mathscr{S}$ and if E is its unit, then every
$A \in \mathscr{S}$ is contained in E and hence

$$X = \bigcup_{A \in \mathscr{S}} A \subset E.$$

A B-algebra $\mathscr{B}$ is called irreducible (with respect to the system $\mathscr{S}$) if X = E,
i.e., an irreducible B-algebra is a B-algebra containing no points that do
not belong to one of the sets $A \in \mathscr{S}$. In every case, it will be enough to
consider only irreducible B-algebras.


Theorem 2 has the following analogue for irreducible B-algebras: 


THEOREM 5. Given any nonempty system of sets $\mathscr{S}$, there is a unique
irreducible¹⁷ B-algebra $\mathscr{B}(\mathscr{S})$ containing $\mathscr{S}$ and contained in every
B-algebra containing $\mathscr{S}$.


Proof. The proof is virtually identical with that of Theorem 2. The
B-algebra $\mathscr{B}(\mathscr{S})$ is called the minimal B-algebra generated by the system
$\mathscr{S}$ or the Borel closure of $\mathscr{S}$. ∎


Remark. An important role is played in analysis by Borel sets or B-sets. 
These are the subsets of the real line belonging to the minimal B-algebra 
generated by the set of all closed intervals [a, b].


Problem 1. Let X be an uncountable set, and let $\mathscr{R}$ be the ring consisting
of all finite subsets of X and their complements. Is $\mathscr{R}$ a σ-ring?


Problem 2. Are open intervals Borel sets? 


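Problem 3. Let y = f(x) be a function defined on a set M and taking
values in a set N. Let $\mathscr{M}$ be a system of subsets of M, and let $f(\mathscr{M})$ denote
the system of all images f(A) of sets $A \in \mathscr{M}$. Moreover, let $\mathscr{N}$ be a system
of subsets of N, and let $f^{-1}(\mathscr{N})$ denote the system of all preimages $f^{-1}(B)$
of sets $B \in \mathscr{N}$. Prove that

a) If $\mathscr{N}$ is a ring, so is $f^{-1}(\mathscr{N})$;

b) If $\mathscr{N}$ is an algebra, so is $f^{-1}(\mathscr{N})$;

c) If $\mathscr{N}$ is a B-algebra, so is $f^{-1}(\mathscr{N})$;

d) $\mathscr{R}(f^{-1}(\mathscr{N})) = f^{-1}(\mathscr{R}(\mathscr{N}))$;

e) $\mathscr{B}(f^{-1}(\mathscr{N})) = f^{-1}(\mathscr{B}(\mathscr{N}))$.

Which of these assertions remain true if $\mathscr{N}$ is replaced by $\mathscr{M}$ and $f^{-1}$ by f?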


17 More exactly, irreducible with respect to $\mathscr{S}$.


METRIC SPACES


5. Basic Concepts 


5.1. Definitions and examples. One of the most important operations in 
mathematical analysis is the taking of limits. Here what matters is not so 
much the algebraic nature of the real numbers,¹ but rather the fact that
distance from one point to another on the real line (or in two or three- 
dimensional space) is well-defined and has certain properties. Roughly 
speaking, a metric space is a set equipped with a distance (or “metric”’) 
which has these same properties. More exactly, we have 


DEFINITION 1. By a metric space is meant a pair $(X, \rho)$ consisting of
a set X and a distance $\rho$, i.e., a single-valued, nonnegative, real function
$\rho(x, y)$ defined for all $x, y \in X$ which has the following three properties:


1) $\rho(x, y) = 0$ if and only if x = y;
2) Symmetry: $\rho(x, y) = \rho(y, x)$;
3) Triangle inequality: $\rho(x, z) \le \rho(x, y) + \rho(y, z)$.


We will often refer to the set X as a “space” and its elements x, y, ... as
“points.” Metric spaces are usually denoted by a single letter, like

$$R = (X, \rho),$$


or even by the same letter X as used for the underlying space, in cases where 
there is no possibility of confusion. 


1 I.e., the fact that the real numbers form a field.


Example 1. Setting

$$\rho(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \neq y, \end{cases}$$

where x and y are elements of an arbitrary set X, we obviously get a metric
space, which might be called a “discrete space” or a “space of isolated
points.”


Example 2. The set of all real numbers with distance

$$\rho(x, y) = |x - y|$$

is a metric space, which we denote by $R^1$.
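The three properties in Definition 1 can be tested mechanically on a finite sample of points. The sketch below is only a sanity check, not a proof: the helper `is_metric_on`, the sample, and the deliberately bad candidate `squared` are our own illustrative choices.

```python
from itertools import product

def is_metric_on(points, rho, tol=1e-12):
    """Brute-force check of the three properties in Definition 1
    on a finite sample of points."""
    for x, y in product(points, repeat=2):
        d = rho(x, y)
        if d < 0 or (d == 0) != (x == y):      # nonnegativity and property 1
            return False
        if abs(d - rho(y, x)) > tol:           # property 2 (symmetry)
            return False
    return all(                                # property 3 (triangle inequality)
        rho(x, z) <= rho(x, y) + rho(y, z) + tol
        for x, y, z in product(points, repeat=3)
    )

discrete = lambda x, y: 0 if x == y else 1     # the distance of Example 1
absolute = lambda x, y: abs(x - y)             # the distance of Example 2
squared = lambda x, y: (x - y) ** 2            # not a metric: (0 - 2)^2 > 1 + 1

sample = [0, 1, 2, 5]
```

On this sample the first two candidates pass, while `squared` fails the triangle inequality for the points 0, 1, 2.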
Example 3. The set of all ordered n-tuples

$$x = (x_1, x_2, \ldots, x_n)$$

of real numbers $x_1, x_2, \ldots, x_n$, with distance

$$\rho(x, y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}, \tag{1}$$

is a metric space denoted by $R^n$ and called n-dimensional Euclidean space
(or simply Euclidean n-space). The distance (1) obviously has properties
1) and 2) in Definition 1. Moreover, it is easy to see that (1) satisfies the
triangle inequality. In fact, let

$$x = (x_1, x_2, \ldots, x_n), \quad y = (y_1, y_2, \ldots, y_n), \quad z = (z_1, z_2, \ldots, z_n)$$

be three points in $R^n$, and let

$$a_k = x_k - y_k, \quad b_k = y_k - z_k \qquad (k = 1, \ldots, n).$$

Then the triangle inequality takes the form

$$\sqrt{\sum_{k=1}^{n} (x_k - z_k)^2} \le \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} + \sqrt{\sum_{k=1}^{n} (y_k - z_k)^2}, \tag{2}$$

or equivalently

$$\sqrt{\sum_{k=1}^{n} (a_k + b_k)^2} \le \sqrt{\sum_{k=1}^{n} a_k^2} + \sqrt{\sum_{k=1}^{n} b_k^2}. \tag{2'}$$

It follows from the Cauchy-Schwarz inequality

$$\Bigl( \sum_{k=1}^{n} a_k b_k \Bigr)^2 \le \sum_{k=1}^{n} a_k^2 \sum_{k=1}^{n} b_k^2 \tag{3}$$

(see Problem 2) that

$$\sum_{k=1}^{n} (a_k + b_k)^2 = \sum_{k=1}^{n} a_k^2 + 2 \sum_{k=1}^{n} a_k b_k + \sum_{k=1}^{n} b_k^2$$
$$\le \sum_{k=1}^{n} a_k^2 + 2 \sqrt{\sum_{k=1}^{n} a_k^2 \sum_{k=1}^{n} b_k^2} + \sum_{k=1}^{n} b_k^2 = \Bigl( \sqrt{\sum_{k=1}^{n} a_k^2} + \sqrt{\sum_{k=1}^{n} b_k^2} \Bigr)^2.$$

Taking square roots, we get (2') and hence (2).


Example 4. Take the same set of ordered n-tuples $x = (x_1, \ldots, x_n)$ as in
the preceding example, but this time define the distance by the function

$$\rho_1(x, y) = \sum_{k=1}^{n} |x_k - y_k|. \tag{4}$$

It is clear that (4) has all three properties of a distance figuring in Definition
1. The corresponding metric space will be denoted by $R_1^n$.


Example 5. Take the same set as in Examples 3 and 4, but this time
define distance between two points $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$
by the formula

$$\rho_0(x, y) = \max_{1 \le k \le n} |x_k - y_k|. \tag{5}$$

Then we again get a metric space (verify all three properties of the distance).
This space, denoted by $R_0^n$, is often as useful as the Euclidean space $R^n$.


Remark. The last three examples show that it is sometimes important
to use a different notation for a metric space than for the underlying set of
points in the space, since the latter can be “metrized” in a variety of different
ways.
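To make the remark concrete, the following sketch evaluates the three distances of Examples 3-5 on the same pair of points. The function names `rho`, `rho_1`, `rho_0` simply mirror formulas (1), (4) and (5); the sample points are an arbitrary choice.

```python
import math

def rho(x, y):
    """Euclidean distance, formula (1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rho_1(x, y):
    """Sum of coordinate differences, formula (4)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def rho_0(x, y):
    """Largest coordinate difference, formula (5)."""
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
distances = (rho(x, y), rho_1(x, y), rho_0(x, y))   # (5.0, 7.0, 4.0)
```

The same two points are at distance 5, 7 or 4 depending on which metric is placed on the underlying set of triples.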


Example 6. The set $C_{[a,b]}$ of all continuous functions defined on the
closed interval [a, b], with distance

$$\rho(f, g) = \max_{a \le t \le b} |f(t) - g(t)| \tag{6}$$

is a metric space of great importance in analysis (again verify the three
properties of distance). This metric space and the underlying set of “points”
will both be denoted by the symbol $C_{[a,b]}$. Instead of $C_{[0,1]}$, we will often
write just C. A space like $C_{[a,b]}$ is often called a “function space,” to
emphasize that its elements are functions.
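The distance (6) can be approximated numerically by sampling. The sketch below replaces the maximum over [a, b] by a maximum over a uniform grid, which is an assumption of the illustration (it can only underestimate the true value); the helper `sup_distance` and the choice of sine and cosine on [0, π] are ours.

```python
import math

def sup_distance(f, g, a, b, samples=10_001):
    """Approximate formula (6) by the maximum of |f - g| over a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(samples))

# Distance between sin and cos regarded as points of C_[0, pi].
d = sup_distance(math.sin, math.cos, 0.0, math.pi)
```

For these two functions the exact maximum of |sin t - cos t| on [0, π] is √2, attained at t = 3π/4, and the grid value agrees with it to high accuracy.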


Example 7. Let $l_2$ be the set of all infinite sequences²

$$x = (x_1, x_2, \ldots, x_k, \ldots)$$

of real numbers $x_1, x_2, \ldots, x_k, \ldots$ satisfying the convergence condition

$$\sum_{k=1}^{\infty} x_k^2 < \infty,$$

where distance between points is defined by

$$\rho(x, y) = \sqrt{\sum_{k=1}^{\infty} (x_k - y_k)^2}. \tag{7}$$


2 The infinite sequence with general term $x_k$ can be written as $\{x_k\}$ or simply as
$x_1, x_2, \ldots, x_k, \ldots$ (this notation is familiar from calculus). It can also be written in
“point notation” as $x = (x_1, x_2, \ldots, x_k, \ldots)$, i.e., as an “ordered ∞-tuple” generalizing
the notion of an ordered n-tuple. (In writing $\{x_k\}$ we have another use of curly brackets,
but the context will always prevent any confusion between the sequence $\{x_k\}$ and the set
whose only element is $x_k$.)


Clearly (7) makes sense for all $x, y \in l_2$, since it follows from the elementary
inequality

$$(x_k \pm y_k)^2 \le 2(x_k^2 + y_k^2)$$

that convergence of the two series

$$\sum_{k=1}^{\infty} x_k^2, \qquad \sum_{k=1}^{\infty} y_k^2$$

implies that of the series

$$\sum_{k=1}^{\infty} (x_k - y_k)^2.$$

At the same time, we find that if the points $(x_1, x_2, \ldots, x_k, \ldots)$ and
$(y_1, y_2, \ldots, y_k, \ldots)$ both belong to $l_2$, then so does the point

$$(x_1 + y_1, x_2 + y_2, \ldots, x_k + y_k, \ldots).$$

The function (7) obviously has the first two defining properties of a distance.
To verify the triangle inequality, which takes the form

$$\sqrt{\sum_{k=1}^{\infty} (x_k - z_k)^2} \le \sqrt{\sum_{k=1}^{\infty} (x_k - y_k)^2} + \sqrt{\sum_{k=1}^{\infty} (y_k - z_k)^2} \tag{8}$$

for the metric (7), we first note that all three series converge, for the reason
just given. Moreover, the inequality

$$\sqrt{\sum_{k=1}^{n} (x_k - z_k)^2} \le \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} + \sqrt{\sum_{k=1}^{n} (y_k - z_k)^2} \tag{9}$$

holds for all n, as shown in Example 3. Taking the limit as $n \to \infty$ in (9),
we get (8), thereby verifying the triangle inequality in $l_2$. Therefore $l_2$ is a
metric space.
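The passage to the limit in (9) can be watched numerically. In the sketch below a point of $l_2$ is modelled as a function giving its k-th coordinate, which is a convention of this illustration only; the sample point $x_k = 1/k$ is chosen because the sum of $1/k^2$ is known to equal $\pi^2/6$.

```python
import math

def partial_distance(x, y, n):
    """Square root of the n-th partial sum of the series in formula (7);
    x and y are functions k -> k-th coordinate (k = 1, 2, ...)."""
    return math.sqrt(sum((x(k) - y(k)) ** 2 for k in range(1, n + 1)))

x = lambda k: 1.0 / k        # the series of 1/k^2 converges, so x lies in l_2
zero = lambda k: 0.0         # the zero sequence

approximations = [partial_distance(x, zero, n) for n in (10, 1_000, 100_000)]
limit = math.pi / math.sqrt(6)   # exact distance, since the sum of 1/k^2 is pi^2/6
```

The partial values increase with n and approach the exact distance from below, which is the behavior the limiting argument relies on.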


Example 8. As in Example 6, consider the set of all functions continuous
on the interval [a, b], but this time define distance by the formula

$$\rho(x, y) = \Bigl( \int_a^b [x(t) - y(t)]^2 \, dt \Bigr)^{1/2} \tag{10}$$

instead of (6). The resulting metric space will be denoted by $C^2_{[a,b]}$. The
first two properties of the metric are obvious, and the fact that (10) satisfies
the triangle inequality is an immediate consequence of Schwarz's inequality

$$\Bigl( \int_a^b x(t) y(t) \, dt \Bigr)^2 \le \int_a^b x^2(t) \, dt \int_a^b y^2(t) \, dt \tag{11}$$

(see Problem 3), by the continuous analogue of the argument given in
Example 3.


Example 9. Next consider the set of all bounded infinite sequences of real
numbers $x = (x_1, x_2, \ldots, x_k, \ldots)$, and let³

$$\rho(x, y) = \sup_k |x_k - y_k|. \tag{12}$$

This gives a metric space which we denote by m. The fact that (12) has the
three properties of a metric is almost obvious.


Example 10. As in Example 3, consider the set of all ordered n-tuples
$x = (x_1, \ldots, x_n)$ of real numbers, but this time define the distance by the
more general formula

$$\rho_p(x, y) = \Bigl( \sum_{k=1}^{n} |x_k - y_k|^p \Bigr)^{1/p}, \tag{13}$$

where p is a fixed number $\ge 1$ (Examples 3 and 4 correspond to the cases
p = 2 and p = 1, respectively). This gives a metric space, which we denote
by $R_p^n$. It is obvious that $\rho_p(x, y) = 0$ if and only if x = y and that $\rho_p(x, y) =
\rho_p(y, x)$, but verification of the triangle inequality for the metric (13) requires
a little work. Let

$$x = (x_1, \ldots, x_n), \quad y = (y_1, \ldots, y_n), \quad z = (z_1, \ldots, z_n)$$

be three points in $R_p^n$, and let

$$a_k = x_k - y_k, \quad b_k = y_k - z_k \qquad (k = 1, \ldots, n),$$

just as in Example 3. Then the triangle inequality

$$\rho_p(x, z) \le \rho_p(x, y) + \rho_p(y, z)$$

takes the form of Minkowski's inequality

$$\Bigl( \sum_{k=1}^{n} |a_k + b_k|^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{n} |a_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{n} |b_k|^p \Bigr)^{1/p}. \tag{14}$$

The inequality is obvious for p = 1, and hence we can confine ourselves to
the case p > 1.
The proof of (14) for p > 1 is in turn based on Hölder's inequality

$$\sum_{k=1}^{n} |a_k b_k| \le \Bigl( \sum_{k=1}^{n} |a_k|^p \Bigr)^{1/p} \Bigl( \sum_{k=1}^{n} |b_k|^q \Bigr)^{1/q}, \tag{15}$$

where the numbers p > 1 and q > 1 satisfy the condition

$$\frac{1}{p} + \frac{1}{q} = 1. \tag{16}$$


3 The least upper bound or supremum of a sequence of real numbers $a_1, a_2, \ldots, a_k, \ldots$
is denoted by $\sup_k a_k$.


We begin by observing that the inequality (15) is homogeneous, i.e., if it
holds for two points $(a_1, \ldots, a_n)$ and $(b_1, \ldots, b_n)$, then it holds for any
two points $(\lambda a_1, \ldots, \lambda a_n)$ and $(\mu b_1, \ldots, \mu b_n)$ where $\lambda$ and $\mu$ are arbitrary
real numbers. Therefore we need only prove (15) for the case

$$\sum_{k=1}^{n} |a_k|^p = \sum_{k=1}^{n} |b_k|^q = 1. \tag{17}$$

Thus, assuming that (17) holds, we now prove that

$$\sum_{k=1}^{n} |a_k b_k| \le 1. \tag{18}$$

Consider the two areas $S_1$ and $S_2$ shown in Figure 8, associated with the
curve in the $\xi\eta$-plane defined by the equation

$$\eta = \xi^{p-1},$$

or equivalently by the equation

$$\xi = \eta^{q-1}.$$

[Figure 8]

Then clearly

$$S_1 = \int_0^a \xi^{p-1} \, d\xi = \frac{a^p}{p}, \qquad S_2 = \int_0^b \eta^{q-1} \, d\eta = \frac{b^q}{q}.$$

Moreover, it is apparent from the figure that

$$S_1 + S_2 \ge ab$$

for arbitrary positive a and b. It follows that

$$ab \le \frac{a^p}{p} + \frac{b^q}{q}. \tag{19}$$

Setting $a = |a_k|$, $b = |b_k|$, summing over k from 1 to n, and taking account
of (16) and (17), we get the desired inequality (18). This proves Hölder's
inequality (15). Note that (15) reduces to Schwarz's inequality if p = 2.

It is now an easy matter to prove Minkowski's inequality (14), starting
from the identity

$$(|a| + |b|)^p = (|a| + |b|)^{p-1} |a| + (|a| + |b|)^{p-1} |b|.$$

In fact, setting $a = a_k$, $b = b_k$ and summing over k from 1 to n, we obtain

$$\sum_{k=1}^{n} (|a_k| + |b_k|)^p = \sum_{k=1}^{n} (|a_k| + |b_k|)^{p-1} |a_k| + \sum_{k=1}^{n} (|a_k| + |b_k|)^{p-1} |b_k|.$$

Next we apply Hölder's inequality (15) to both sums on the right, bearing
in mind that (p − 1)q = p:

$$\sum_{k=1}^{n} (|a_k| + |b_k|)^p \le \Bigl( \sum_{k=1}^{n} (|a_k| + |b_k|)^p \Bigr)^{1/q} \Bigl( \Bigl[ \sum_{k=1}^{n} |a_k|^p \Bigr]^{1/p} + \Bigl[ \sum_{k=1}^{n} |b_k|^p \Bigr]^{1/p} \Bigr).$$

Dividing both sides of this inequality by

$$\Bigl( \sum_{k=1}^{n} (|a_k| + |b_k|)^p \Bigr)^{1/q},$$

we get

$$\Bigl( \sum_{k=1}^{n} (|a_k| + |b_k|)^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{n} |a_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{n} |b_k|^p \Bigr)^{1/p},$$

which immediately implies (14), thereby proving the triangle inequality in $R_p^n$.
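Hölder's inequality (15) and Minkowski's inequality (14) can be spot-checked on random vectors. The sketch below is a numerical illustration, not a substitute for the proof; the helper names, the fixed seed, the exponents tried and the small tolerance for rounding error are all choices made here.

```python
import random

def p_norm(v, p):
    """(sum |v_k|^p)^(1/p), the expression appearing in (14) and (15)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def check_inequalities(a, b, p, tol=1e-9):
    """Return (Hoelder holds, Minkowski holds) for vectors a, b and exponent p > 1."""
    q = p / (p - 1.0)                         # conjugate exponent: 1/p + 1/q = 1
    holder = sum(abs(s * t) for s, t in zip(a, b)) <= p_norm(a, p) * p_norm(b, q) + tol
    minkowski = (
        p_norm([s + t for s, t in zip(a, b)], p)
        <= p_norm(a, p) + p_norm(b, p) + tol
    )
    return holder, minkowski

rng = random.Random(0)                        # fixed seed for reproducibility
results = [
    check_inequalities(
        [rng.uniform(-1, 1) for _ in range(5)],
        [rng.uniform(-1, 1) for _ in range(5)],
        p,
    )
    for p in (1.5, 2.0, 3.0, 7.0)
    for _ in range(100)
]
```

Every pair in `results` comes out as `(True, True)`, as the two inequalities require for p > 1.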
Example 11. Finally let $l_p$ be the set of all infinite sequences

$$x = (x_1, x_2, \ldots, x_k, \ldots)$$

of real numbers satisfying the convergence condition

$$\sum_{k=1}^{\infty} |x_k|^p < \infty$$

for some fixed number $p \ge 1$, where distance between points is defined by

$$\rho(x, y) = \Bigl( \sum_{k=1}^{\infty} |x_k - y_k|^p \Bigr)^{1/p} \tag{20}$$

(the case p = 2 has already been considered in Example 7). It follows from
Minkowski's inequality (14) that

$$\Bigl( \sum_{k=1}^{n} |x_k - y_k|^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{n} |x_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{n} |y_k|^p \Bigr)^{1/p} \tag{21}$$

for any n. Since the series

$$\sum_{k=1}^{\infty} |x_k|^p, \qquad \sum_{k=1}^{\infty} |y_k|^p$$

converge, by hypothesis, we can take the limit as $n \to \infty$ in (21), obtaining

$$\Bigl( \sum_{k=1}^{\infty} |x_k - y_k|^p \Bigr)^{1/p} \le \Bigl( \sum_{k=1}^{\infty} |x_k|^p \Bigr)^{1/p} + \Bigl( \sum_{k=1}^{\infty} |y_k|^p \Bigr)^{1/p} < \infty.$$

This shows that (20) actually makes sense for arbitrary $x, y \in l_p$. At the same
time, we have verified that the triangle inequality holds in $l_p$ (the other two
properties of a metric are obviously satisfied). Therefore $l_p$ is a metric space.


Remark. If $R = (X, \rho)$ is a metric space and M is any subset of X, then
obviously $R^* = (M, \rho)$ is again a metric space, called a subspace of the
original metric space R. This device gives us infinitely more examples of
metric spaces.


5.2. Continuous mappings and homeomorphisms. Isometric spaces. Let f
be a mapping of one metric space X into another metric space Y, so that
f associates an element $y = f(x) \in Y$ with each element $x \in X$. Then f is
said to be continuous at the point $x_0 \in X$ if, given any $\varepsilon > 0$, there exists a
$\delta > 0$ such that

$$\rho'(f(x), f(x_0)) < \varepsilon$$

whenever

$$\rho(x, x_0) < \delta$$

(here $\rho$ is the metric in X and $\rho'$ the metric in Y). The mapping f is said
to be continuous on X if it is continuous at every point $x \in X$.


Remark. This definition reduces to the usual definition of continuity
familiar from calculus if X and Y are both numerical sets, i.e., if f is a real
function defined on some subset of the real line.


Given two metric spaces X and Y, let f be a one-to-one mapping of X onto
Y, and suppose f and $f^{-1}$ are both continuous. Then f is called a
homeomorphic mapping, or simply a homeomorphism (between X and Y). Two
spaces X and Y are said to be homeomorphic if there exists a homeomorphism
between them.


Example. The function

$$y = f(x) = \frac{2}{\pi} \arctan x$$

establishes a homeomorphism between the whole real line $(-\infty, \infty)$ and the
open interval (−1, 1).
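The mapping in this example and its inverse are simple enough to evaluate directly. In the sketch below the inverse $x = \tan(\pi y / 2)$ is written out explicitly (it follows by solving the formula for x); the sample points are arbitrary.

```python
import math

def f(x):
    """Maps the whole real line onto the open interval (-1, 1)."""
    return 2.0 / math.pi * math.atan(x)

def f_inverse(y):
    """Inverse mapping, defined for -1 < y < 1."""
    return math.tan(math.pi * y / 2.0)

xs = [-1000.0, -2.5, 0.0, 0.3, 42.0]
images = [f(x) for x in xs]                   # all strictly between -1 and 1
round_trip = [f_inverse(y) for y in images]   # recovers xs up to rounding
```

Both functions are continuous and strictly increasing, so f is one-to-one, onto, and continuous in both directions, as the definition of a homeomorphism requires.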


DEFINITION 2. A one-to-one mapping f of one metric space $R = (X, \rho)$
onto another metric space $R' = (Y, \rho')$ is said to be an isometric mapping
(or simply an isometry) if

$$\rho(x_1, x_2) = \rho'(f(x_1), f(x_2))$$

for all $x_1, x_2 \in R$. Correspondingly, the spaces R and R' are said to be
isometric (to each other).


Thus if R and R' are isometric, the “metric relations” between the
elements of R are the same as those between the elements of R', i.e., R and
R' differ only in the explicit nature of their elements (this distinction is
unimportant from the standpoint of metric space theory). From now on,
we will not distinguish between isometric spaces, regarding them simply as
identical.


Remark. We will discuss continuity and homeomorphisms from a more
general point of view in Sec. 9.6.


Problem 1. Given a metric space $(X, \rho)$, prove that


a) $|\rho(x, z) - \rho(y, u)| \le \rho(x, y) + \rho(z, u)$ $(x, y, z, u \in X)$;
b) $|\rho(x, z) - \rho(y, z)| \le \rho(x, y)$ $(x, y, z \in X)$.


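Problem 2. Verify that

$$\Bigl( \sum_{k=1}^{n} a_k b_k \Bigr)^2 = \sum_{k=1}^{n} a_k^2 \sum_{k=1}^{n} b_k^2 - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (a_i b_j - b_i a_j)^2.$$

Deduce the Cauchy-Schwarz inequality (3) from this identity.


Problem 3. Verify that

$$\Bigl( \int_a^b x(t) y(t) \, dt \Bigr)^2 = \int_a^b x^2(t) \, dt \int_a^b y^2(t) \, dt - \frac{1}{2} \int_a^b \int_a^b [x(s) y(t) - y(s) x(t)]^2 \, ds \, dt.$$

Deduce Schwarz's inequality (11) from this identity.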
Problem 4. What goes wrong in Example 10, p. 41 if p < 1? 
Hint. Show that Minkowski’s inequality fails for p < 1. 


Problem 5. Prove that the metric (5) is the limiting case of the metric (13)
in the sense that

$$\rho_0(x, y) = \max_{1 \le k \le n} |x_k - y_k| = \lim_{p \to \infty} \Bigl( \sum_{k=1}^{n} |x_k - y_k|^p \Bigr)^{1/p}.$$


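Problem 6. Starting from the inequality (19), deduce Hölder's integral
inequality

$$\int_a^b |x(t) y(t)| \, dt \le \Bigl( \int_a^b |x(t)|^p \, dt \Bigr)^{1/p} \Bigl( \int_a^b |y(t)|^q \, dt \Bigr)^{1/q} \qquad \Bigl( \frac{1}{p} + \frac{1}{q} = 1 \Bigr),$$

valid for any functions x(t) and y(t) such that the integrals on the right exist.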


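Problem 7. Use Hölder's integral inequality to prove Minkowski's integral
inequality

$$\Bigl( \int_a^b |x(t) + y(t)|^p \, dt \Bigr)^{1/p} \le \Bigl( \int_a^b |x(t)|^p \, dt \Bigr)^{1/p} + \Bigl( \int_a^b |y(t)|^p \, dt \Bigr)^{1/p} \qquad (p \ge 1).$$


Problem 8. Exhibit an isometry between the spaces $C_{[0,1]}$ and $C_{[1,2]}$.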


6. Convergence. Open and Closed Sets 


6.1. Closure of a set. Limit points. By the open sphere (or open ball)
$S(x_0, r)$ in a metric space R we mean the set of points $x \in R$ satisfying the
inequality

$$\rho(x_0, x) < r$$

($\rho$ is the metric of R).⁴ The fixed point $x_0$ is called the center of the sphere,
and the number r is called its radius. By the closed sphere (or closed ball)
$S[x_0, r]$ with center $x_0$ and radius r we mean the set of points $x \in R$ satisfying
the inequality

$$\rho(x_0, x) \le r.$$

An open sphere of radius $\varepsilon$ with center $x_0$ will also be called an ε-neighborhood
of $x_0$, denoted by $O_\varepsilon(x_0)$.

A point $x \in R$ is called a contact point of a set $M \subset R$ if every
neighborhood of x contains at least one point of M. The set of all contact points of a
set M is denoted by [M] and is called the closure of M. Obviously $M \subset [M]$,
since every point of M is a contact point of M. By the closure operator in
a metric space R, we mean the mapping of the system of all subsets of R into
itself carrying each set $M \subset R$ into its closure [M].


THEOREM 1. The closure operator has the following properties:


1) If $M \subset N$, then $[M] \subset [N]$;
2) $[[M]] = [M]$;
3) $[M \cup N] = [M] \cup [N]$;
4) $[\emptyset] = \emptyset$.


Proof. Property 1) is obvious. To prove property 2), let $x \in [[M]]$.
Then any given neighborhood $O_\varepsilon(x)$ contains a point $x_1 \in [M]$. Consider
the sphere $O_{\varepsilon_1}(x_1)$ of radius

$$\varepsilon_1 = \varepsilon - \rho(x, x_1).$$

Clearly $O_{\varepsilon_1}(x_1)$ is contained in $O_\varepsilon(x)$. In fact, if $z \in O_{\varepsilon_1}(x_1)$, then
$\rho(z, x_1) < \varepsilon_1$, and hence, since $\rho(x, x_1) = \varepsilon - \varepsilon_1$, it follows from the
triangle inequality that

$$\rho(z, x) < \varepsilon_1 + (\varepsilon - \varepsilon_1) = \varepsilon,$$

i.e., $z \in O_\varepsilon(x)$. Since $x_1 \in [M]$, there is a point $x_2 \in M$ in $O_{\varepsilon_1}(x_1)$. But
then $x_2 \in O_\varepsilon(x)$ and hence $x \in [M]$, since $O_\varepsilon(x)$ is an arbitrary
neighborhood of x. Therefore $[[M]] \subset [M]$. But obviously $[M] \subset [[M]]$ and
hence $[[M]] = [M]$, as required.


4 Any confusion between “sphere” meant in the sense of spherical surface and “sphere”
meant in the sense of a solid sphere (or ball) will always be avoided by judicious use of the
adjectives “open” or “closed.”


To prove property 3), let $x \in [M \cup N]$ and suppose $x \notin [M] \cup [N]$.
Then $x \notin [M]$ and $x \notin [N]$. But then there exist neighborhoods $O_{\varepsilon_1}(x)$
and $O_{\varepsilon_2}(x)$ such that $O_{\varepsilon_1}(x)$ contains no points of M while $O_{\varepsilon_2}(x)$ contains
no points of N. It follows that the neighborhood $O_\varepsilon(x)$, where $\varepsilon =
\min \{\varepsilon_1, \varepsilon_2\}$, contains no points of either M or N, and hence no points
of $M \cup N$, contrary to the assumption that $x \in [M \cup N]$. Therefore
$x \in [M] \cup [N]$, and hence

$$[M \cup N] \subset [M] \cup [N], \tag{1}$$

since x is an arbitrary point of $[M \cup N]$. On the other hand, since
$M \subset M \cup N$ and $N \subset M \cup N$, it follows from property 1) that
$[M] \subset [M \cup N]$ and $[N] \subset [M \cup N]$. But then

$$[M] \cup [N] \subset [M \cup N],$$

which together with (1) implies $[M \cup N] = [M] \cup [N]$.
Finally, to prove property 4), we observe that given any $M \subset R$,

$$[M] = [M \cup \emptyset] = [M] \cup [\emptyset],$$

by property 3). It follows that $[\emptyset] \subset [M]$. But this is possible for
arbitrary M only if $[\emptyset] = \emptyset$. (Alternatively, the set with no elements
can have no contact points!) ∎


A point $x \in R$ is called a limit point of a set $M \subset R$ if every neighborhood
of x contains infinitely many points of M. The limit point may or may not
belong to M. For example, if M is the set of rational numbers in the interval
[0, 1], then every point of [0, 1], rational or not, is a limit point of M.

A point x belonging to a set M is called an isolated point of M if there
is a (“sufficiently small”) neighborhood of x containing no points of M other
than x itself.


6.2. Convergence and limits. A sequence of points $\{x_n\} = x_1, x_2, \ldots,
x_n, \ldots$ in a metric space R is said to converge to a point $x \in R$ if every
neighborhood $O_\varepsilon(x)$ of x contains all points $x_n$ starting from a certain index
(more exactly, if, given any $\varepsilon > 0$, there is an integer $N_\varepsilon$ such that $O_\varepsilon(x)$
contains all points $x_n$ with $n > N_\varepsilon$). The point x is called the limit of the
sequence $\{x_n\}$, and we write $x_n \to x$ (as $n \to \infty$). Clearly, $\{x_n\}$ converges to
x if and only if

$$\lim_{n \to \infty} \rho(x, x_n) = 0.$$
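The definition can be traced on a concrete sequence in the plane with the Euclidean metric. The sequence `x_n`, its limit and the helper `first_index_inside` in the sketch below are hypothetical choices made for illustration; the search relies on the fact that, for this particular sequence, the distance to the limit decreases as n grows.

```python
import math

def rho(x, y):
    """Euclidean distance in the plane."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def x_n(n):
    """Hypothetical sequence of points in the plane."""
    return (1.0 / n, 1.0 + 1.0 / n ** 2)

limit = (0.0, 1.0)

def first_index_inside(eps, max_n=10**6):
    """First n (up to max_n) with rho(x_n, limit) < eps. Because the distance
    decreases with n for this sequence, every later point also lies in the
    eps-neighborhood of the limit."""
    for n in range(1, max_n + 1):
        if rho(x_n(n), limit) < eps:
            return n
    return None
```

Here ρ(x_n, limit) = √(1/n² + 1/n⁴), which tends to 0, so an index of this kind exists for every ε > 0; shrinking ε simply pushes the index further out.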


It is an immediate consequence of the definition of a limit that


1) No sequence can have two distinct limits;
2) If a sequence $\{x_n\}$ converges to a point x, then so does every
subsequence of $\{x_n\}$


(give the details).


THEOREM 2. A necessary and sufficient condition for a point x to be a
contact point of a set M is that there exist a sequence $\{x_n\}$ of points of M
converging to x.


Proof. The condition is necessary, since if x is a contact point of M,
then every neighborhood $O_{1/n}(x)$ contains at least one point $x_n \in M$,
and these points form a sequence $\{x_n\}$ converging to x. The sufficiency
is obvious. ∎


THEOREM 2'. A necessary and sufficient condition for a point x to be a
limit point of a set M is that there exist a sequence $\{x_n\}$ of distinct points
of M converging to x.


Proof. Clearly, if x is a limit point of M, then the points $x_n \in
O_{1/n}(x) \cap M$ figuring in the proof of Theorem 2 can be chosen to be
distinct. This proves the necessity, and the sufficiency is again obvious. ∎


6.3. Dense subsets. Separable spaces. Let A and B be two subsets of a
metric space R. Then A is said to be dense in B if $[A] \supset B$. In particular,
A is said to be everywhere dense (in R) if [A] = R. A set A is said to be
nowhere dense if it is dense in no (open) sphere at all.


Example 1. The set of all rational points is dense in the real line $R^1$.


Example 2. The set of all points $x = (x_1, x_2, \ldots, x_n)$ with rational
coordinates is dense in each of the spaces $R^n$, $R_0^n$ and $R_1^n$ introduced in Examples
3-5, pp. 38-39.


Example 3. The set of all points $x = (x_1, x_2, \ldots, x_k, \ldots)$ with only
finitely many nonzero coordinates, each a rational number, is dense in the
space $l_2$ introduced in Example 7, p. 39.


Example 4. The set of all polynomials with rational coefficients is dense
in both spaces $C_{[a,b]}$ and $C^2_{[a,b]}$ introduced in Examples 6 and 8, pp. 39 and
40.


DEFINITION. A metric space is said to be separable if it has a countable 
everywhere dense subset. 


Example 5. The spaces $R^1$, $R^n$, $R_0^n$, $R_1^n$, $l_2$, $C_{[a,b]}$ and $C^2_{[a,b]}$ are all separable,
since the sets in Examples 1-4 above are all countable.


Example 6. The “discrete space” M described in Example 1, p. 38
contains a countable everywhere dense subset and hence is separable if and only
if it is itself a countable set, since clearly [M] = M in this case.


Example 7. There is no countable everywhere dense set in the space m of
all bounded sequences, introduced in Example 9, p. 41. In fact, consider
the set E of all sequences consisting exclusively of zeros and ones. Clearly,
E has the power of the continuum (recall Theorem 6, Sec. 2.5), since there
is a one-to-one correspondence between E and the set of all subsets of the
set $Z_+ = \{1, 2, \ldots, n, \ldots\}$ (describe the correspondence). According to
formula (12), p. 41, the distance between any two distinct points of E equals 1.
Suppose we surround each point of E by an open sphere of radius ½, thereby
obtaining an uncountably infinite family of pairwise disjoint spheres. Then
if some set M is everywhere dense in m, there must be at least one point of
M in each of the spheres. It follows that M cannot be countable and hence
that m cannot be separable.


6.4. Closed sets. We say that a subset M of a metric space R is closed if it
coincides with its own closure, i.e., if [M] = M. In other words, a set is
called closed if it contains all its limit points (see Problem 2).


Example 1. The empty set $\emptyset$ and the whole space R are closed sets.

Example 2. Every closed interval [a, b] on the real line is a closed set.

Example 3. Every closed sphere in a metric space is a closed set. In
particular, the set of all functions f in the space $C_{[a,b]}$ such that $|f(t)| \le K$
(where K is a constant) is closed.

Example 4. The set of all functions f in $C_{[a,b]}$ such that $|f(t)| < K$ (an
open sphere) is not closed. The closure of this set is the closed sphere in the
preceding example.


Example 5. Any set consisting of a finite number of points is closed. 


THEOREM 3. The intersection of an arbitrary number of closed sets is 
closed. The union of a finite number of closed sets is closed. 


Proof. Given arbitrary closed sets $F_\alpha$ indexed by a parameter $\alpha$, let x be a
limit point of the intersection

$$F = \bigcap_{\alpha} F_\alpha.$$

Then any neighborhood $O_\varepsilon(x)$ contains infinitely many points of F, and
hence infinitely many points of each $F_\alpha$. Therefore x is a limit point of
each $F_\alpha$, and hence belongs to each $F_\alpha$, since the sets $F_\alpha$ are all closed.
It follows that $x \in F$, and hence that F itself is closed.

Next let

$$F = \bigcup_{k=1}^{n} F_k$$

be the union of a finite number of closed sets $F_k$, and suppose x does
not belong to F. Then x does not belong to any of the sets $F_k$, and hence
cannot be a limit point of any of them. But then, for every k, there is a
neighborhood $O_{\varepsilon_k}(x)$ containing no more than a finite number of points
of $F_k$. Choosing

$$\varepsilon = \min\{\varepsilon_1, \ldots, \varepsilon_n\},$$

we get a neighborhood $O_\varepsilon(x)$ containing no more than a finite number of
points of F, so that x cannot be a limit point of F. This proves that a
point $x \notin F$ cannot be a limit point of F. Therefore F is closed. ∎


6.5. Open sets. A point x is called an interior point of a set M if x has a
neighborhood $O_\varepsilon(x) \subset M$, i.e., a neighborhood consisting entirely of points
of M. A set is said to be open if its points are all interior points.


Example 1. Every open interval (a, b) on the real line is an open set. In
fact, if a < x < b, choose $\varepsilon = \min\{x - a, b - x\}$. Then clearly
$O_\varepsilon(x) \subset (a, b)$.


Example 2. Every open sphere S(a, r) in a metric space is an open set.
In fact, $x \in S(a, r)$ implies $\rho(a, x) < r$. Hence, choosing $\varepsilon = r - \rho(a, x)$, we
have $O_\varepsilon(x) = S(x, \varepsilon) \subset S(a, r)$.


Example 3. Let M be the set of all functions f in $C_{[a,b]}$ such that $f(t) < g(t)$,
where g is a fixed function in $C_{[a,b]}$. Then M is an open subset of $C_{[a,b]}$.


THEOREM 4. A subset M of a metric space R is open if and only if its
complement $R - M$ is closed.


Proof. If M is open, then every point $x \in M$ has a neighborhood
(entirely) contained in M. Therefore no point $x \in M$ can be a contact
point of $R - M$. In other words, if x is a contact point of $R - M$,
then $x \in R - M$, i.e., $R - M$ is closed.

Conversely, if $R - M$ is closed, then any point $x \in M$ must have a
neighborhood contained in M, since otherwise every neighborhood of x
would contain points of $R - M$, i.e., x would be a contact point of
$R - M$ not in $R - M$. Therefore M is open. ∎


COROLLARY. The empty set $\emptyset$ and the whole space R are open sets.

Proof. An immediate consequence of Theorem 4 and Example 1,
Sec. 6.4. ∎

THEOREM 5. The union of an arbitrary number of open sets is open. The 
intersection of a finite number of open sets is open. 


Proof. This is the "dual" of Theorem 3. The proof is an immediate
consequence of Theorem 4 and formulas (3)-(4), p. 4. ∎


6.6. Open and closed sets on the real line. The structure of open and closed
sets in a given metric space can be quite complicated. This is true even for
open and closed sets in a Euclidean space of two or more dimensions
($R^n$, $n \ge 2$). In the one-dimensional case, however, it is an easy matter to
give a complete description of all open sets (and hence of all closed sets):


THEOREM 6. Every open set G on the real line is the union of a finite or
countable system of pairwise disjoint open intervals.⁵


Proof. Let x be an arbitrary point of G. By the definition of an open
set, there is at least one open interval containing x and contained in G.
Let $I_x$ be the union of all such open intervals. Then, as we now show, $I_x$
is itself an open interval. In fact, let⁶

$$a = \inf I_x, \qquad b = \sup I_x$$

(where we allow the cases $a = -\infty$ and $b = +\infty$). Then obviously

$$I_x \subset (a, b). \qquad (2)$$

Moreover, suppose y is an arbitrary point of (a, b) distinct from x,
where, to be explicit, we assume that a < y < x. Then there is a point
$y' \in I_x$ such that a < y' < y (why?). Hence G contains an open interval
containing the points y' and x. But then this interval also contains y,
i.e., $y \in I_x$. (The case y > x is treated similarly.) Moreover, the point
x belongs to $I_x$, by hypothesis. It follows that $I_x \supset (a, b)$, and hence by
(2) that $I_x = (a, b)$. Thus $I_x$ is itself an open interval, as asserted, in fact
the open interval (a, b).

By its very construction, the interval (a, b) is contained in G and is
not a subset of a larger interval contained in G. Moreover, it is clear
that two intervals $I_x$ and $I_{x'}$ corresponding to distinct points x and x'
either coincide or else are disjoint (otherwise $I_x$ and $I_{x'}$ would both be
contained in a larger interval $I_x \cup I_{x'} = I \subset G$). There are no more than
countably many such pairwise disjoint intervals $I_x$. In fact, choosing an
arbitrary rational point in each $I_x$, we establish a one-to-one correspondence
between the intervals $I_x$ and a subset of the rational numbers.
Finally, it is obvious that

$$G = \bigcup_{x} I_x. \qquad ∎$$
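
For an open set given as a finite union of open intervals, the decomposition in Theorem 6 can be computed directly. The following is a minimal sketch under that finiteness assumption; the function name `components` and the sample family `G` are illustrative and not part of the text.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # the open interval (a, b), with a < b


def components(intervals: List[Interval]) -> List[Interval]:
    """Merge a finite family of open intervals into pairwise disjoint ones."""
    merged: List[Interval] = []
    for a, b in sorted(intervals):
        # Two open intervals overlap only if the new left end lies strictly
        # inside the current component. (0, 1) and (1, 2) stay separate,
        # because the point 1 belongs to neither of them.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


G = [(0, 1), (0.5, 2), (2, 3), (5, 6), (5.5, 5.8)]
print(components(G))
```

Each interval in the result plays the role of one of the intervals $I_x$ of the proof: it is contained in the union and cannot be enlarged inside it.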


COROLLARY. Every closed set on the real line can be obtained by deleting
a finite or countable system of pairwise disjoint intervals from the line.

Proof. An immediate consequence of Theorems 4 and 6. ∎


5 The infinite intervals $(-\infty, \infty)$, $(a, \infty)$, and $(-\infty, b)$ are regarded as open.
6 Given a set of real numbers E, inf E denotes the greatest lower bound or infimum
of E, while sup E denotes the least upper bound or supremum of E.


Example 1. Every closed interval [a, b] is a closed set (here a and b are
necessarily finite).


Example 2. Every single-element set $\{x_0\}$ is closed.


Example 3. The union of a finite number of closed intervals and single- 
element sets is a closed set. 


Example 4 (The Cantor set). A more interesting example of a closed set
on the line can be constructed as follows: Delete the open interval (1/3, 2/3)
from the closed interval $F_0 = [0, 1]$, and let $F_1$ denote the remaining closed
set, consisting of two closed intervals. Then delete the open intervals
(1/9, 2/9) and (7/9, 8/9) from $F_1$, and let $F_2$ denote the remaining closed set,
consisting of four closed intervals. Then delete the "middle third" from each
of these four intervals, getting a new closed set $F_3$, and so on (see Figure 9).
Continuing this process indefinitely, we get a sequence of closed sets $F_n$ such
that

$$F_0 \supset F_1 \supset F_2 \supset \cdots \supset F_n \supset \cdots$$

(such a sequence is said to be decreasing). The intersection

$$F = \bigcap_{n=0}^{\infty} F_n$$

of all these sets is called the Cantor set. Clearly F is closed, by Theorem 3,
and is obtained from the unit interval [0, 1] by deleting a countable number
of open intervals. In fact, at the nth stage of the construction, we delete
$2^{n-1}$ intervals, each of length $1/3^n$.
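
The first few stages of this construction are easy to generate exactly. The sketch below uses rational arithmetic; the helper names `next_stage` and `cantor_stage` are illustrative, not notation from the text.

```python
from fractions import Fraction
from typing import List, Tuple

Closed = Tuple[Fraction, Fraction]  # the closed interval [a, b]


def next_stage(intervals: List[Closed]) -> List[Closed]:
    """Delete the open middle third of every closed interval."""
    out: List[Closed] = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out


def cantor_stage(n: int) -> List[Closed]:
    """Return the closed intervals making up F_n, starting from F_0 = [0, 1]."""
    stage = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        stage = next_stage(stage)
    return stage


print([(str(a), str(b)) for a, b in cantor_stage(2)])

# F_n consists of 2^n intervals of length 1/3^n, so its total length is
# (2/3)^n and the total length deleted so far is 1 - (2/3)^n.
remaining = sum(b - a for a, b in cantor_stage(5))
print(remaining, 1 - remaining)
```

Since $(2/3)^n \to 0$, the total length of the deleted intervals tends to 1 as the number of stages grows.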

To describe the structure of the set F, we first note that F contains the
points

$$0, \; 1, \; \tfrac{1}{3}, \; \tfrac{2}{3}, \; \tfrac{1}{9}, \; \tfrac{2}{9}, \; \tfrac{7}{9}, \; \tfrac{8}{9}, \; \ldots, \qquad (3)$$

i.e., the end points of the deleted intervals (together with the points 0 and 1).

FIGURE 9

However, F contains many other points. In fact, given any $x \in [0, 1]$, suppose
we write x in ternary notation, representing x as a series

$$x = \frac{a_1}{3} + \frac{a_2}{3^2} + \cdots + \frac{a_n}{3^n} + \cdots, \qquad (4)$$

where each of the numbers $a_1, a_2, \ldots, a_n, \ldots$ can only take one of the three
values 0, 1, 2. Then it is easy to see that x belongs to F if and only if x has a
representation (4) such that none of the numbers $a_1, a_2, \ldots, a_n, \ldots$ equals
1 (think things through).⁷
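
For a rational point the ternary criterion can be checked mechanically. The sketch below is one possible test, not a procedure from the text: it reads off ternary digits by repeatedly applying $x \mapsto 3x$ or $x \mapsto 3x - 2$, and stops as soon as x falls into a deleted middle third or the digits start repeating. The function name `in_cantor_set` is an assumption made for illustration.

```python
from fractions import Fraction


def in_cantor_set(x: Fraction) -> bool:
    """Decide whether a rational x belongs to the Cantor set F."""
    if not 0 <= x <= 1:
        return False
    seen = set()
    # The denominators of the iterates divide that of x, so only finitely
    # many values can occur and the loop must terminate.
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x        # next ternary digit can be taken to be 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2    # next ternary digit can be taken to be 2
        else:
            return False     # x lies in a deleted open middle third
    return True              # digits repeat, and none of them is 1


print(in_cantor_set(Fraction(1, 4)), in_cantor_set(Fraction(1, 2)))
```

For x = 1/4 the iterates are 1/4, 3/4, 1/4, ..., corresponding to the ternary digits 0, 2, 0, 2, ..., so 1/4 belongs to F even though it is not an end point of a deleted interval.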

Remarkably enough, the set F has the power of the continuum, i.e.,
there are as many points in F as in the whole interval [0, 1], despite the fact
that the sum of the lengths of the deleted intervals equals

$$\frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \cdots = 1.$$

To see this, we associate a new point

$$y = \frac{b_1}{2} + \frac{b_2}{2^2} + \cdots + \frac{b_n}{2^n} + \cdots$$

with each point (4), where⁸

$$b_n = \begin{cases} 0 & \text{if } a_n = 0, \\ 1 & \text{if } a_n = 2. \end{cases}$$

In this way, we get a mapping of F onto the whole interval [0, 1] (the mapping
is one-to-one except that the two end points of each deleted interval have the
same image). Since F is a subset of [0, 1], it follows that F has the power of
the continuum, as asserted.

Let $A_1$ be the set of points (3). Then $F = A_1 \cup A_2$, where the set $A_2 = F - A_1$
is uncountable, since $A_1$ is countable and F itself is not. The points of $A_1$
are often called "points (of F) of the first kind," while those of $A_2$ are called
"points of the second kind."


Problem 1. Give an example of a metric space R and two open spheres
$S(x, r_1)$ and $S(y, r_2)$ in R such that $S(x, r_1) \subset S(y, r_2)$ although $r_1 > r_2$.


Problem 2. Prove that every contact point of a set M is either a limit point 
of M or an isolated point of M. 


7 Just as in the case of ordinary decimals, certain numbers can be written in two
distinct ways. For example,

$$\frac{1}{3} = \frac{1}{3} + \frac{0}{3^2} + \cdots + \frac{0}{3^n} + \cdots = \frac{0}{3} + \frac{2}{3^2} + \cdots + \frac{2}{3^n} + \cdots.$$

Since none of the numerators in the second representation equals 1, the point 1/3 belongs
to F (this is already obvious from the construction of F).

8 If x has two representations of the form (4), then one and only one of them has no
numerators $a_1, a_2, \ldots, a_n, \ldots$ equal to 1. These are the numbers used to define $b_n$.


Comment. In particular, [M] can only contain points of the following
three types:

a) Limit points of M belonging to M; 

b) Limit points of M which do not belong to M; 

c) Isolated points of M. 


Thus [M] is the union of M and the set of all its limit points. 


Problem 3. Prove that if $x_n \to x$, $y_n \to y$ as $n \to \infty$, then
$\rho(x_n, y_n) \to \rho(x, y)$.

Hint. Use Problem 1a, p. 45.


Problem 4. Let f be a mapping of one metric space X into another metric
space Y. Prove that f is continuous at a point $x_0$ if and only if the sequence
$\{y_n\} = \{f(x_n)\}$ converges to $y = f(x_0)$ whenever the sequence $\{x_n\}$
converges to $x_0$.


Problem 5. Prove that


a) The closure of any set M is a closed set;
b) [M] is the smallest closed set containing M.


Problem 6. Is the union of infinitely many closed sets necessarily closed? 
How about the intersection of infinitely many open sets? Give examples. 


Problem 7. Prove directly that the point 1/4 belongs to the Cantor set F,
although it is not an end point of any of the open intervals deleted in
constructing F.


Hint. The point 1/4 divides the interval [0, 1] in the ratio 1:3. It also
divides the interval [0, 1/3] left after deleting (1/3, 2/3) in the ratio 3:1, and so on.


Problem 8. Let F be the Cantor set. Prove that


a) The points of the first kind, i.e., the points (3), form an everywhere
dense subset of F;
b) The numbers of the form $t_1 + t_2$, where $t_1, t_2 \in F$, fill the whole interval
[0, 2].


Problem 9. Given a metric space R, let A be a subset of R and x a point
of R. Then the number

$$\rho(A, x) = \inf_{a \in A} \rho(a, x)$$

is called the distance between A and x. Prove that


a) $x \in A$ implies $\rho(A, x) = 0$, but not conversely;

b) $\rho(A, x)$ is a continuous function of x (for fixed A);

c) $\rho(A, x) = 0$ if and only if x is a contact point of A;

d) $[A] = A \cup M$, where M is the set of all points x such that $\rho(A, x) = 0$.


Problem 10. Let A and B be two subsets of a metric space R. Then the
number

$$\rho(A, B) = \inf_{a \in A, \; b \in B} \rho(a, b)$$

is called the distance between A and B. Show that $\rho(A, B) = 0$ if $A \cap B \ne \emptyset$,
but not conversely.


Problem 11. Let $M_K$ be the set of all functions f in $C_{[a,b]}$ satisfying a
Lipschitz condition, i.e., the set of all f such that

$$|f(t_1) - f(t_2)| \le K |t_1 - t_2|$$

for all $t_1, t_2 \in [a, b]$, where K is a fixed positive number. Prove that


a) $M_K$ is closed and in fact is the closure of the set of all differentiable
functions on [a, b] such that $|f'(t)| \le K$;
b) The set

$$M = \bigcup_{K} M_K$$

of all functions satisfying a Lipschitz condition for some K is not
closed;
c) The closure of M is the whole space $C_{[a,b]}$.


Problem 12. An open set G in n-dimensional Euclidean space $R^n$ is said
to be connected if any points $x, y \in G$ can be joined by a polygonal line⁹
lying entirely in G. For example, the (open) disk $x^2 + y^2 < 1$ is connected,
but not the union of the two disks

$$x^2 + y^2 < 1, \qquad (x - 2)^2 + y^2 < 1$$

(even though they share a contact point). An open subset of an open set G
is called a component of G if it is connected and is not contained in a larger
connected subset of G. Use Zorn's lemma to prove that every open set G in
$R^n$ is the union of no more than countably many pairwise disjoint
components.


Comment. In the case n = 1 (i.e., on the real line) every connected open
set is an open interval, possibly one of the infinite intervals $(-\infty, \infty)$,
$(a, \infty)$, $(-\infty, b)$. Thus Theorem 6 on the structure of open sets on the line
is tantamount to two assertions:


1) Every open set on the line is the union of a finite or countable number
of components;
2) Every open connected set on the line is an open interval.


The first assertion holds for open sets in $R^n$ (and in fact is susceptible to
further generalizations), while the second assertion pertains specifically to
the real line.


9 By a polygonal line we mean a curve obtained by joining a finite number of straight
line segments end to end.


7. Complete Metric Spaces 


7.1. Definitions and examples. The reader is presumably already familiar 
with the notion of the completeness of the real line. The real line is, of course, 
a particularly simple example of a metric space. We now make the natural 
generalization of the notion of completeness to the case of an arbitrary 
metric space. 


DEFINITION 1. A sequence $\{x_n\}$ of points in a metric space R with metric
$\rho$ is said to satisfy the Cauchy criterion if, given any $\varepsilon > 0$, there is an
integer $N_\varepsilon$ such that $\rho(x_n, x_{n'}) < \varepsilon$ for all $n, n' > N_\varepsilon$.


DEFINITION 2. A sequence $\{x_n\}$ of points in a metric space R is called
a Cauchy sequence (or a fundamental sequence) if it satisfies the Cauchy
criterion.


THEOREM 1. Every convergent sequence $\{x_n\}$ is fundamental.


Proof. If $\{x_n\}$ converges to a limit x, then, given any $\varepsilon > 0$, there is
an integer $N_\varepsilon$ such that

$$\rho(x_n, x) < \frac{\varepsilon}{2}$$

for all $n > N_\varepsilon$. But then

$$\rho(x_n, x_{n'}) \le \rho(x_n, x) + \rho(x_{n'}, x) < \varepsilon$$

for all $n, n' > N_\varepsilon$. ∎


DEFINITION 3. A metric space R is said to be complete if every Cauchy 
sequence in R converges to an element of R. Otherwise R is said to be 
incomplete. 


Example 1. Let R be the "space of isolated points" considered in Example
1, p. 38. Then the Cauchy sequences in R are just the "stationary sequences,"
i.e., the sequences $\{x_n\}$ all of whose terms are the same starting from some
index n. Every such sequence is obviously convergent to an element of R.
Hence R is complete.


Example 2. The completeness of the real line $R^1$ is familiar from
elementary analysis.


Example 3. The completeness of Euclidean n-space $R^n$ follows from that
of $R^1$. In fact, let

$$x^{(p)} = (x_1^{(p)}, \ldots, x_n^{(p)}) \qquad (p = 1, 2, \ldots)$$

be a fundamental sequence of points of $R^n$. Then, given any $\varepsilon > 0$, there
exists an $N_\varepsilon$ such that

$$\sum_{k=1}^{n} (x_k^{(p)} - x_k^{(q)})^2 < \varepsilon^2$$

for all $p, q > N_\varepsilon$. It follows that

$$|x_k^{(p)} - x_k^{(q)}| < \varepsilon \qquad (k = 1, \ldots, n)$$

for all $p, q > N_\varepsilon$, i.e., each $\{x_k^{(p)}\}$ is a fundamental sequence in $R^1$. Let

$$x = (x_1, \ldots, x_n),$$

where

$$x_k = \lim_{p \to \infty} x_k^{(p)}.$$

Then obviously

$$\lim_{p \to \infty} x^{(p)} = x.$$

This proves the completeness of $R^n$. The completeness of the spaces $R_1^n$ and
$R_0^n$ introduced in Examples 4 and 5, p. 39 is proved in almost the same way
(give the details).

Example 4. Let $\{x_n(t)\}$ be a Cauchy sequence in the function space $C_{[a,b]}$
considered in Example 6, p. 39. Then, given any $\varepsilon > 0$, there is an $N_\varepsilon$ such
that

$$|x_n(t) - x_{n'}(t)| < \varepsilon \qquad (1)$$

for all $n, n' > N_\varepsilon$ and all $t \in [a, b]$. It follows that the sequence $\{x_n(t)\}$ is
uniformly convergent. But the limit of a uniformly convergent sequence of
continuous functions is itself a continuous function (see Problem 1). Taking
the limit as $n' \to \infty$ in (1), we find that

$$|x_n(t) - x(t)| \le \varepsilon$$

for all $n > N_\varepsilon$ and all $t \in [a, b]$, i.e., $\{x_n(t)\}$ converges in the metric of $C_{[a,b]}$
to a function $x(t) \in C_{[a,b]}$. Hence $C_{[a,b]}$ is a complete metric space.

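Example 5. Next let $\{x^{(n)}\}$ be a sequence in the space $l_2$ considered in
Example 7, p. 39, so that

$$x^{(n)} = (x_1^{(n)}, x_2^{(n)}, \ldots, x_k^{(n)}, \ldots),$$

$$\sum_{k=1}^{\infty} (x_k^{(n)})^2 < \infty \qquad (n = 1, 2, \ldots).$$

Suppose further that $\{x^{(n)}\}$ is a Cauchy sequence. Then, given any $\varepsilon > 0$,
there is an $N_\varepsilon$ such that

$$\rho^2(x^{(n)}, x^{(n')}) = \sum_{k=1}^{\infty} (x_k^{(n)} - x_k^{(n')})^2 < \varepsilon \qquad (2)$$

if $n, n' > N_\varepsilon$. It follows that

$$(x_k^{(n)} - x_k^{(n')})^2 < \varepsilon \qquad (k = 1, 2, \ldots),$$

i.e., for every k the sequence $\{x_k^{(n)}\}$ is fundamental and hence convergent.
Let

$$x_k = \lim_{n \to \infty} x_k^{(n)},$$

$$x = (x_1, x_2, \ldots, x_k, \ldots).$$

Then, as we now show, x is itself a point of $l_2$ and moreover $\{x^{(n)}\}$ converges
to x in the $l_2$ metric, so that $l_2$ is a complete metric space.
In fact, (2) implies

$$\sum_{k=1}^{M} (x_k^{(n)} - x_k^{(n')})^2 < \varepsilon \qquad (3)$$

for any fixed M. Holding n fixed in (3) and taking the limit as $n' \to \infty$, we get

$$\sum_{k=1}^{M} (x_k^{(n)} - x_k)^2 \le \varepsilon. \qquad (4)$$

Since (4) holds for arbitrary M, we can in turn take the limit of (4) as $M \to \infty$,
obtaining

$$\sum_{k=1}^{\infty} (x_k^{(n)} - x_k)^2 \le \varepsilon. \qquad (5)$$

Just as on p. 40, the convergence of the two series

$$\sum_{k=1}^{\infty} (x_k^{(n)})^2, \qquad \sum_{k=1}^{\infty} (x_k^{(n)} - x_k)^2$$

implies that of the series

$$\sum_{k=1}^{\infty} x_k^2.$$

This proves that $x \in l_2$. Moreover, since $\varepsilon$ is arbitrarily small, (5) implies

$$\lim_{n \to \infty} \rho(x^{(n)}, x) = \lim_{n \to \infty} \sqrt{\sum_{k=1}^{\infty} (x_k^{(n)} - x_k)^2} = 0,$$
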
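i.e., $\{x^{(n)}\}$ converges to x in the $l_2$ metric, as asserted.


Example 6. It is easy to show that the space $C^2_{[a,b]}$ of Example 8, p. 40 is
incomplete. If

$$\varphi_n(t) = \begin{cases} -1 & \text{if } -1 \le t \le -\frac{1}{n}, \\ nt & \text{if } -\frac{1}{n} \le t \le \frac{1}{n}, \\ 1 & \text{if } \frac{1}{n} \le t \le 1, \end{cases}$$

then $\{\varphi_n(t)\}$ is a fundamental sequence in $C^2_{[-1,1]}$, since

$$\int_{-1}^{1} [\varphi_n(t) - \varphi_{n'}(t)]^2 \, dt \le \frac{2}{\min\{n, n'\}}.$$

However, $\{\varphi_n(t)\}$ cannot converge to a function in $C^2_{[-1,1]}$. In fact, consider
the discontinuous function

$$\psi(t) = \begin{cases} -1 & \text{if } t < 0, \\ 1 & \text{if } t \ge 0. \end{cases}$$

Then, given any function $f \in C^2_{[-1,1]}$, it follows from Schwarz's inequality
(obviously still valid for piecewise continuous functions) that

$$\left( \int_{-1}^{1} [f(t) - \psi(t)]^2 \, dt \right)^{1/2} \le \left( \int_{-1}^{1} [f(t) - \varphi_n(t)]^2 \, dt \right)^{1/2} + \left( \int_{-1}^{1} [\varphi_n(t) - \psi(t)]^2 \, dt \right)^{1/2}.$$

But the integral on the left is nonzero, by the continuity of f, and moreover
it is clear that

$$\lim_{n \to \infty} \int_{-1}^{1} [\varphi_n(t) - \psi(t)]^2 \, dt = 0.$$

Therefore

$$\int_{-1}^{1} [f(t) - \varphi_n(t)]^2 \, dt$$

cannot converge to zero as $n \to \infty$.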
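
The estimate that makes the sequence in Example 6 fundamental can be checked numerically. The sketch below assumes the functions are the ramps equal to nt between -1/n and 1/n and to -1 or 1 outside that range, and approximates the integral over [-1, 1] by the midpoint rule; the names `phi` and `sq_dist` and the step count are illustrative choices.

```python
def phi(n: int, t: float) -> float:
    """The continuous ramp phi_n on [-1, 1]: the value n*t clamped to [-1, 1]."""
    return max(-1.0, min(1.0, n * t))


def sq_dist(n: int, m: int, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of the integral of (phi_n - phi_m)^2 over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        d = phi(n, t) - phi(m, t)
        total += d * d
    return total * h


# Compare the squared distance with the bound 2 / min{n, n'}.
for n, m in [(2, 5), (10, 40), (50, 51)]:
    print(n, m, sq_dist(n, m), 2 / min(n, m))
```

The two functions differ only where |t| < 1/min{n, n'}, and there their difference has absolute value at most 1, which is where the bound comes from.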


7.2. The nested sphere theorem. A sequence of closed spheres

$$S[x_1, r_1], \; S[x_2, r_2], \; \ldots, \; S[x_n, r_n], \; \ldots$$

in a metric space R is said to be nested (or decreasing) if

$$S[x_1, r_1] \supset S[x_2, r_2] \supset \cdots \supset S[x_n, r_n] \supset \cdots.$$

Using this concept, we can prove a simple criterion for the completeness of R:


THEOREM 2 (Nested sphere theorem). A metric space R is complete if
and only if every nested sequence $\{S_n\} = \{S[x_n, r_n]\}$ of closed spheres in
R such that $r_n \to 0$ as $n \to \infty$ has a nonempty intersection

$$\bigcap_{n=1}^{\infty} S_n.$$

Proof. If R is complete and if $\{S_n\} = \{S[x_n, r_n]\}$ is any nested
sequence of closed spheres in R such that $r_n \to 0$ as $n \to \infty$, then the
sequence $\{x_n\}$ of centers of the spheres is fundamental, since
$\rho(x_n, x_{n'}) \le r_n$ for $n' > n$ and $r_n \to 0$ as $n \to \infty$. Therefore $\{x_n\}$ has a limit. Let

$$x = \lim_{n \to \infty} x_n.$$

Then

$$x \in \bigcap_{n=1}^{\infty} S_n.$$

In fact, $S_n$ contains every point of the sequence $\{x_n\}$ except possibly the
points $x_1, x_2, \ldots, x_{n-1}$, and hence x is a contact point of every sphere $S_n$.
But $S_n$ is closed, and hence $x \in S_n$ for all n.

Conversely, suppose every nested sequence of closed spheres in R
with radii converging to zero has a nonempty intersection, and let $\{x_n\}$
be any fundamental sequence in R. Then $\{x_n\}$ has a limit in R. To see this,
use the fact that $\{x_n\}$ is fundamental to choose a term $x_{n_1}$ of the sequence
$\{x_n\}$ such that

$$\rho(x_n, x_{n_1}) < \frac{1}{2}$$

for all $n > n_1$, and let $S_1$ be the closed sphere of radius 1 with center $x_{n_1}$.
Then choose a term $x_{n_2}$ of $\{x_n\}$ such that $n_2 > n_1$ and

$$\rho(x_n, x_{n_2}) < \frac{1}{2^2}$$

for all $n > n_2$, and let $S_2$ be the closed sphere of radius 1/2 with center $x_{n_2}$.
Continue this construction indefinitely, i.e., once having chosen terms
$x_{n_1}, x_{n_2}, \ldots, x_{n_k}$ ($n_1 < n_2 < \cdots < n_k$), choose a term $x_{n_{k+1}}$ such that
$n_{k+1} > n_k$ and

$$\rho(x_n, x_{n_{k+1}}) < \frac{1}{2^{k+1}}$$

for all $n > n_{k+1}$, let $S_{k+1}$ be the closed sphere of radius $1/2^k$ with center
$x_{n_{k+1}}$, and so on. This gives a nested sequence $\{S_k\}$ of closed spheres
with radii converging to zero. By hypothesis, these spheres have a nonempty
intersection, i.e., there is a point x in all the spheres. This point
is obviously the limit of the sequence $\{x_{n_k}\}$. But if a fundamental sequence
contains a subsequence converging to x, then the sequence itself
must converge to x (why?), i.e.,

$$\lim_{n \to \infty} x_n = x. \qquad ∎$$


7.3. Baire's theorem. It will be recalled from Sec. 6.3 that a subset A of a
metric space R is said to be nowhere dense in R if it is dense in no (open)
sphere at all, or equivalently, if every sphere $S \subset R$ contains another sphere
S' such that $S' \cap A = \emptyset$ (check the equivalence). This concept plays an
important role in


THEOREM 3 (Baire). A complete metric space R cannot be represented 
as the union of a countable number of nowhere dense sets. 


Proof. Suppose to the contrary that

$$R = \bigcup_{n=1}^{\infty} A_n, \qquad (6)$$

where every set $A_n$ is nowhere dense in R. Let $S_0 \subset R$ be a closed sphere
of radius 1. Since $A_1$ is nowhere dense in $S_0$, being nowhere dense in R,
there is a closed sphere $S_1$ of radius less than 1/2 such that $S_1 \subset S_0$ and
$S_1 \cap A_1 = \emptyset$. Since $A_2$ is nowhere dense in $S_1$, being nowhere dense
in $S_0$, there is a closed sphere $S_2$ of radius less than 1/3 such that $S_2 \subset S_1$
and $S_2 \cap A_2 = \emptyset$, and so on. In this way, we get a nested sequence of
closed spheres $\{S_n\}$ with radii converging to zero such that

$$S_n \cap A_n = \emptyset \qquad (n = 1, 2, \ldots).$$

By the nested sphere theorem, the intersection

$$\bigcap_{n=1}^{\infty} S_n$$

contains a point x. By construction, x cannot belong to any of the
sets $A_n$, i.e.,

$$x \notin \bigcup_{n=1}^{\infty} A_n.$$

It follows that

$$R \ne \bigcup_{n=1}^{\infty} A_n,$$

contrary to (6). Hence the representation (6) is impossible. ∎


COROLLARY. A complete metric space R without isolated points is
uncountable.


Proof. Every single-element set {x} is nowhere dense in R. ∎


7.4. Completion of a metric space. As we now show, an incomplete metric
space can always be enlarged (in an essentially unique way) to give a complete
metric space.


DEFINITION 4. Given a metric space R with closure [R], a complete
metric space R* is called a completion of R if $R \subset R^*$ and [R] = R*,
i.e., if R is a subset of R* everywhere dense in R*.


Example 1. Clearly R* = R if R is already complete (see Problem 7).


Example 2. The space of all real numbers is the completion of the space 
of all rational numbers. 


THEOREM 4. Every metric space R has a completion. This completion 
is unique to within an isometric mapping carrying every point x € R into 
itself. 


Proof. The proof is somewhat lengthy, but completely straightforward.
First we prove the uniqueness, showing that if R* and R**
are two completions of R, then there is a one-to-one mapping
$x^{**} = \varphi(x^*)$ of R* onto R** such that $\varphi(x) = x$ for all $x \in R$ and

$$\rho_1(x^*, y^*) = \rho_2(x^{**}, y^{**}) \qquad (7)$$

($y^{**} = \varphi(y^*)$), where $\rho_1$ is the distance in R* and $\rho_2$ the distance in R**.
The required mapping $\varphi$ is constructed as follows: Let x* be an arbitrary
point of R*. Then, by the definition of a completion, there is a sequence
$\{x_n\}$ of points of R converging to x*. The points of the sequence $\{x_n\}$
also belong to R**, where they form a fundamental sequence (why?).
Therefore $\{x_n\}$ converges to a point $x^{**} \in R^{**}$, since R** is complete.
It is clear that x** is independent of the choice of the sequence $\{x_n\}$
converging to the point x* (why?). If we set $\varphi(x^*) = x^{**}$, then $\varphi$ is
the required mapping. In fact, $\varphi(x) = x$ for all $x \in R$, since if $x_n \to x \in R$,
then obviously $x = x^* \in R^*$, $x^{**} = x$. Moreover, suppose $x_n \to x^*$,
$y_n \to y^*$ in R*, while $x_n \to x^{**}$, $y_n \to y^{**}$ in R**. Then, if $\rho$ is the
distance in R,

$$\rho_1(x^*, y^*) = \lim_{n \to \infty} \rho_1(x_n, y_n) = \lim_{n \to \infty} \rho(x_n, y_n) \qquad (8)$$

(see Problem 3, p. 54), while at the same time

$$\rho_2(x^{**}, y^{**}) = \lim_{n \to \infty} \rho_2(x_n, y_n) = \lim_{n \to \infty} \rho(x_n, y_n). \qquad (8')$$

But (8) and (8') together imply (7).


We must now prove the existence of a completion of R. Given an
arbitrary metric space R, we say that two Cauchy sequences $\{x_n\}$ and
$\{\tilde{x}_n\}$ in R are equivalent and write $\{x_n\} \sim \{\tilde{x}_n\}$ if

$$\lim_{n \to \infty} \rho(x_n, \tilde{x}_n) = 0.$$

As anticipated by the notation and terminology, $\sim$ is reflexive, symmetric
and transitive, i.e., $\sim$ is an equivalence relation in the sense of
Sec. 1.4. Therefore the set of all Cauchy sequences of points in the space
R can be partitioned into classes of equivalent sequences. Let these
classes be the points of a new space R*. Then we define the distance
between two arbitrary points $x^*, y^* \in R^*$ by the formula

$$\rho_1(x^*, y^*) = \lim_{n \to \infty} \rho(x_n, y_n), \qquad (9)$$

where $\{x_n\}$ is any "representative" of x* (namely, any Cauchy sequence
in the class x*) and $\{y_n\}$ is any representative of y*.

The next step is to verify that (9) is in fact a distance, i.e., that (9)
exists, is independent of the choice of the sequences $\{x_n\} \in x^*$, $\{y_n\} \in y^*$,
and satisfies the three properties of a distance figuring in Definition 1,
p. 37. Given any $\varepsilon > 0$, it follows from the triangle inequality in R
(recall Problem 1b, p. 45) that

$$|\rho(x_n, y_n) - \rho(x_{n'}, y_{n'})| = |\rho(x_n, y_n) - \rho(x_{n'}, y_n) + \rho(x_{n'}, y_n) - \rho(x_{n'}, y_{n'})|$$
$$\le |\rho(x_n, y_n) - \rho(x_{n'}, y_n)| + |\rho(x_{n'}, y_n) - \rho(x_{n'}, y_{n'})|$$
$$\le \rho(x_n, x_{n'}) + \rho(y_n, y_{n'}) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \qquad (10)$$

for all sufficiently large n and n'. Therefore the sequence of real numbers
$\{s_n\} = \{\rho(x_n, y_n)\}$ is fundamental and hence has a limit. This limit is
independent of the choice of $\{x_n\} \in x^*$, $\{y_n\} \in y^*$. In fact, suppose

$$\{x_n\}, \{\tilde{x}_n\} \in x^*, \qquad \{y_n\}, \{\tilde{y}_n\} \in y^*.$$

Then

$$|\rho(x_n, y_n) - \rho(\tilde{x}_n, \tilde{y}_n)| \le \rho(x_n, \tilde{x}_n) + \rho(y_n, \tilde{y}_n)$$

by a calculation analogous to (10). But

$$\lim_{n \to \infty} \rho(x_n, \tilde{x}_n) = \lim_{n \to \infty} \rho(y_n, \tilde{y}_n) = 0,$$

since $\{x_n\} \sim \{\tilde{x}_n\}$, $\{y_n\} \sim \{\tilde{y}_n\}$, and hence

$$\lim_{n \to \infty} \rho(x_n, y_n) = \lim_{n \to \infty} \rho(\tilde{x}_n, \tilde{y}_n).$$

As for the three properties of a metric, it is obvious that
$\rho_1(x^*, y^*) = \rho_1(y^*, x^*)$, and the fact that $\rho_1(x^*, y^*) = 0$ if and only if x* = y* is an
immediate consequence of the definition of equivalent Cauchy sequences.
To verify the triangle inequality in R*, we start from the triangle inequality

$$\rho(x_n, z_n) \le \rho(x_n, y_n) + \rho(y_n, z_n)$$

in the original space R and then take the limit as $n \to \infty$, obtaining

$$\lim_{n \to \infty} \rho(x_n, z_n) \le \lim_{n \to \infty} \rho(x_n, y_n) + \lim_{n \to \infty} \rho(y_n, z_n),$$

i.e.,

$$\rho_1(x^*, z^*) \le \rho_1(x^*, y^*) + \rho_1(y^*, z^*).$$

We now come to the crucial step of showing that R* is a completion
of R. Suppose that with every point $x \in R$, we associate the class $x^* \in R^*$
of all Cauchy sequences converging to x. Let

$$x = \lim_{n \to \infty} x_n, \qquad y = \lim_{n \to \infty} y_n.$$

Then clearly

$$\rho(x, y) = \lim_{n \to \infty} \rho(x_n, y_n)$$

(recall Problem 3, p. 54), while on the other hand

$$\rho_1(x^*, y^*) = \lim_{n \to \infty} \rho(x_n, y_n),$$

by definition. Therefore

$$\rho(x, y) = \rho_1(x^*, y^*),$$

and hence the mapping of R into R* carrying x into x* is isometric.
Accordingly, we need no longer distinguish between the original space R
and its image in R*, in particular between the two metrics $\rho$ and $\rho_1$
(recall the relevant comments on p. 44). In other words, R can be regarded
as a subset of R*. The theorem will be proved once we succeed
in showing that


1) R is everywhere dense in R*, i.e., [R] = R*;
2) R* is complete.


To this end, given any point $x^* \in R^*$ and any $\varepsilon > 0$, choose a
representative of x*, namely a Cauchy sequence $\{x_n\}$ in the class x*. Let
N be such that $\rho(x_n, x_{n'}) < \varepsilon$ for all $n, n' > N$. Then

$$\rho(x_n, x^*) = \lim_{n' \to \infty} \rho(x_n, x_{n'}) \le \varepsilon$$

if n > N, i.e., every neighborhood of the point x* contains a point of R.
It follows that [R] = R*.

Finally, to show that R* is complete, we first note that by the very
definition of R*, any Cauchy sequence $\{x_n\}$ consisting of points in R
converges to some point in R*, namely to the point $x^* \in R^*$ defined by
$\{x_n\}$. Moreover, since R is dense in R*, given any Cauchy sequence
$\{x_n^*\}$ consisting of points in R*, we can find an equivalent sequence $\{x_n\}$
consisting of points in R. In fact, we need only choose $x_n$ to be any
point of R such that $\rho(x_n, x_n^*) < 1/n$. The resulting sequence $\{x_n\}$ is
fundamental, and, as just shown, converges to a point $x^* \in R^*$. But then
the sequence $\{x_n^*\}$ also converges to x*. ∎


Example. If R is the space of all rational numbers, then R* is the space of 
all real numbers, both equipped with the distance p(x, y) = |x — y|. In this 
way, we can “construct the real number system.’’ However, there still 
remains the problem of suitably defining sums and products of real numbers 
and verifying that the usual axioms of arithmetic are satisfied. 
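
The need for a completion can be seen on a concrete Cauchy sequence of rational numbers with no rational limit. The sketch below uses the Newton iterates for the square root of 2, a standard example chosen here for illustration (it does not appear in the text), computed in exact rational arithmetic; the function name `newton_sqrt2` is hypothetical.

```python
from fractions import Fraction
from typing import List


def newton_sqrt2(steps: int) -> List[Fraction]:
    """Rational iterates x_{n+1} = (x_n + 2/x_n)/2, starting from x_0 = 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq


xs = newton_sqrt2(6)
for a, b in zip(xs, xs[1:]):
    # Successive distances |x_{n+1} - x_n| shrink rapidly, while x_n^2
    # approaches 2 without ever being equal to it.
    print(float(abs(b - a)), float(abs(b * b - 2)))
```

Every term is rational and the terms cluster together, but since no rational number has square 2, the class of this sequence is a point of R* that does not come from any point of R.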


Problem 1. Prove that the limit f(t) of a uniformly convergent sequence
of functions $\{f_n(t)\}$ continuous on [a, b] is itself a function continuous on
[a, b].


Hint. Clearly

$$|f(t) - f(t_0)| \le |f(t) - f_n(t)| + |f_n(t) - f_n(t_0)| + |f_n(t_0) - f(t_0)|,$$

where $t, t_0 \in [a, b]$. Use the uniform convergence to make the sum of the
first and third terms on the right small for sufficiently large n. Then use the
continuity of $f_n(t)$ to make the second term small for t sufficiently close to $t_0$.


Problem 2. Prove that the space m in Example 9, p. 41 is complete. 


Problem 3. Prove that if R is complete, then the intersection $\bigcap_{n=1}^{\infty} S_n$
figuring in Theorem 2 consists of a single point.


Problem 4. By the diameter of a subset A of a metric space R is meant the
number

$$d(A) = \sup_{x, y \in A} \rho(x, y).$$

Suppose R is complete, and let $\{A_n\}$ be a sequence of closed subsets of R
nested in the sense that

$$A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots.$$

Suppose further that

$$\lim_{n \to \infty} d(A_n) = 0.$$

Prove that the intersection $\bigcap_{n=1}^{\infty} A_n$ is nonempty.


Problem 5. A subset A of a metric space R is said to be bounded if its
diameter d(A) is finite. Prove that the union of a finite number of bounded
sets is bounded.


Problem 6. Give an example of a complete metric space R and a nested
sequence $\{A_n\}$ of closed subsets of R such that

$$\bigcap_{n=1}^{\infty} A_n = \emptyset.$$

Reconcile this example with Problem 4.


Problem 7. Prove that a subspace of a complete metric space R is com- 
plete if and only if it is closed. 


Problem 8. Prove that the real line equipped with the distance

$$\rho(x, y) = |\arctan x - \arctan y|$$

is an incomplete metric space.


Problem 9. Give an example of a complete metric space homeomorphic 
to an incomplete metric space. 


Hint. Consider the example on p. 44. 


Comment. Thus homeomorphic metric spaces can have different “‘metric 
properties.” 


Problem 10. Carry out the program discussed in the last sentence of the 
example on p. 65. 


Hint. If $\{x_n\}$ and $\{y_n\}$ are Cauchy sequences of rational numbers serving
as "representatives" of real numbers x* and y*, respectively, define x* + y*
as the real number with representative $\{x_n + y_n\}$.


8. Contraction Mappings 


8.1. Definition of a contraction mapping. The fixed point theorem. Let $A$
be a mapping of a metric space $R$ into itself. Then $x$ is called a fixed point
of $A$ if $Ax = x$, i.e., if $A$ carries $x$ into itself. Suppose there exists a number
$\alpha < 1$ such that
\[ \rho(Ax, Ay) \le \alpha \rho(x, y) \tag{1} \]
for every pair of points $x, y \in R$. Then $A$ is said to be a contraction mapping.
Every contraction mapping is automatically continuous, since it follows from
the “contraction condition” (1) that $Ax_n \to Ax$ whenever $x_n \to x$.


THEOREM 1 (Fixed point theorem). Every contraction mapping $A$
defined on a complete metric space $R$ has a unique fixed point.


10 Often called the method of successive approximations (see the remark following
Theorem 1) or the principle of contraction mappings.

Proof. Given an arbitrary point $x_0 \in R$, let¹¹
\[ x_1 = Ax_0, \quad x_2 = Ax_1 = A^2 x_0, \quad \ldots, \quad x_n = Ax_{n-1} = A^n x_0, \quad \ldots \tag{2} \]


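Then the sequence $\{x_n\}$ is fundamental. In fact, assuming to be explicit
that $n \le n'$, we have
\[
\begin{aligned}
\rho(x_n, x_{n'}) &= \rho(A^n x_0, A^{n'} x_0) \le \alpha^n \rho(x_0, x_{n'-n}) \\
&\le \alpha^n [\rho(x_0, x_1) + \rho(x_1, x_2) + \cdots + \rho(x_{n'-n-1}, x_{n'-n})] \\
&\le \alpha^n \rho(x_0, x_1) [1 + \alpha + \alpha^2 + \cdots + \alpha^{n'-n-1}]
 \le \alpha^n \rho(x_0, x_1) \frac{1}{1 - \alpha}.
\end{aligned}
\]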


But the expression on the right can be made arbitrarily small for sufficiently
large $n$, since $\alpha < 1$. Since $R$ is complete, the sequence $\{x_n\}$,
being fundamental, has a limit
\[ x = \lim_{n \to \infty} x_n. \]


Then, by the continuity of $A$,
\[ Ax = A \lim_{n \to \infty} x_n = \lim_{n \to \infty} Ax_n = \lim_{n \to \infty} x_{n+1} = x. \]
This proves the existence of a fixed point $x$. To prove the uniqueness of $x$,
we note that if
\[ Ax = x, \qquad Ay = y, \]
(1) becomes
\[ \rho(x, y) \le \alpha \rho(x, y). \]
But then $\rho(x, y) = 0$ since $\alpha < 1$, and hence $x = y$. ∎


Remark. The fixed point theorem can be used to prove existence and 
uniqueness theorems for solutions of equations of various types. Besides 
showing that an equation of the form Ax = x has a unique solution, the 
fixed point theorem also gives a practical method for finding the solution, i.e., 
calculation of the “successive approximations’’ (2). In fact, as shown in 
the proof, the approximations (2) actually converge to the solution of the 
equation Ax = x. For this reason, the fixed point theorem is often called 
the method of successive approximations. 
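
The remark can be made concrete with a short sketch in Python. The name `successive_approximations` and its parameters are hypothetical, and the stopping rule is an assumption: it uses the estimate $\rho(x_n, x) \le \alpha^n \rho(x_0, x_1)/(1 - \alpha)$, which follows from the inequality in the proof on letting $n' \to \infty$.

```python
import math


def successive_approximations(A, x0, rho, alpha, tol=1e-10):
    """Return (x_n, n), with n chosen so that the estimate
    alpha**n * rho(x0, x1) / (1 - alpha) does not exceed tol."""
    if not 0 <= alpha < 1:
        raise ValueError("need 0 <= alpha < 1")
    x1 = A(x0)
    d01 = rho(x0, x1)
    if d01 == 0:
        return x0, 0          # x0 is already the fixed point
    if alpha == 0:
        return x1, 1          # A is constant, so x1 is the fixed point
    # Smallest n >= 1 with alpha**n * d01 / (1 - alpha) <= tol.
    n = max(1, math.ceil(math.log(tol * (1 - alpha) / d01) / math.log(alpha)))
    x = x1
    for _ in range(n - 1):
        x = A(x)
    return x, n


# A(t) = t/2 + 1 contracts the real line with alpha = 1/2; its fixed point is 2.
x, n = successive_approximations(lambda t: 0.5 * t + 1.0, 0.0,
                                 lambda u, v: abs(u - v), 0.5)
print(x, n)
```

The number of steps is fixed in advance from $\alpha$ and $\rho(x_0, x_1)$ alone, which is possible only because the contraction constant is known.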


Example 1. Let $f$ be a function defined on the closed interval $[a, b]$ which
maps $[a, b]$ into itself and satisfies a Lipschitz condition
\[ |f(x_1) - f(x_2)| \le K |x_1 - x_2|, \tag{3} \]
with constant $K < 1$. Then $f$ is a contraction mapping, and hence, by


11 $A^2 x$ means $A(Ax)$, $A^3 x$ means $A(A^2 x) = A^2(Ax)$, and so on.

FIGURE 10
Theorem 1, the sequence 


\[ x_0, \quad x_1 = f(x_0), \quad x_2 = f(x_1), \quad \ldots \tag{4} \]
converges to the unique root of the equation $f(x) = x$. In particular, the
“contraction condition” (3) holds if $f$ has a continuous derivative $f'$ on $[a, b]$
such that
\[ |f'(x)| \le K < 1. \]
The behavior of the successive approximations (4) in the cases $0 < f'(x) < 1$
and $-1 < f'(x) < 0$ is shown in Figures 10 and 11.
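
As an illustration of the second case, here is a minimal Python sketch. The choice $f(x) = \cos x$ on $[0, 1]$ is an assumption made for the example and does not come from the text; `iterates` is a hypothetical helper. Here $f$ maps $[0, 1]$ into itself and $|f'(x)| = |\sin x| \le \sin 1 < 1$.

```python
import math


def iterates(f, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{k+1} = f(x_k), as in (4)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs


K = math.sin(1.0)                 # Lipschitz constant of cos on [0, 1]
xs = iterates(math.cos, 0.5, 60)
root = xs[-1]                     # approximate root of cos x = x
# Because f' is negative near the root, consecutive approximations
# lie on opposite sides of it.
alternates = all((xs[k] - root) * (xs[k + 1] - root) < 0 for k in range(10))
print(root, alternates)
```

With a derivative between 0 and 1 the approximations would instead approach the root from one side.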


Example 2. Consider the mapping $A$ of $n$-dimensional space into itself
given by the system of linear equations
\[ y_i = \sum_{j=1}^{n} a_{ij} x_j + b_i \qquad (i = 1, \ldots, n). \tag{5} \]

FIGURE 11

If $A$ is a contraction mapping, we can use the method of successive approximations
to solve the equation $Ax = x$. The conditions under which $A$ is a
contraction mapping depend on the choice of metric. We now examine three
cases:


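1) The space $R_0^n$ with metric
\[ \rho(x, y) = \max_{1 \le i \le n} |x_i - y_i|. \]
In this case, writing $y = Ax$ and $\tilde{y} = A\tilde{x}$, we have
\[
\begin{aligned}
\rho(y, \tilde{y}) &= \max_i |y_i - \tilde{y}_i| = \max_i \Big| \sum_j a_{ij} (x_j - \tilde{x}_j) \Big| \\
&\le \max_i \sum_j |a_{ij}| \, |x_j - \tilde{x}_j| \\
&\le \max_i \sum_j |a_{ij}| \max_j |x_j - \tilde{x}_j| = \Big( \max_i \sum_j |a_{ij}| \Big) \rho(x, \tilde{x}),
\end{aligned}
\]
and the contraction condition is
\[ \sum_j |a_{ij}| \le \alpha < 1 \qquad (i = 1, \ldots, n). \tag{6} \]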


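2) The space $R_1^n$ with metric
\[ \rho(x, y) = \sum_{i=1}^{n} |x_i - y_i|. \]
Here
\[
\begin{aligned}
\rho(y, \tilde{y}) &= \sum_i |y_i - \tilde{y}_i| = \sum_i \Big| \sum_j a_{ij} (x_j - \tilde{x}_j) \Big| \\
&\le \sum_i \sum_j |a_{ij}| \, |x_j - \tilde{x}_j| \le \Big( \max_j \sum_i |a_{ij}| \Big) \rho(x, \tilde{x}),
\end{aligned}
\]
and the contraction condition is now
\[ \sum_i |a_{ij}| \le \alpha < 1 \qquad (j = 1, \ldots, n). \tag{7} \]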


3) Ordinary Euclidean space $R^n$ with metric
\[ \rho(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}. \]
Using the Cauchy-Schwarz inequality, we have
\[ \rho^2(y, \tilde{y}) = \sum_i \Big( \sum_j a_{ij} (x_j - \tilde{x}_j) \Big)^2 \le \Big( \sum_i \sum_j a_{ij}^2 \Big) \rho^2(x, \tilde{x}), \]
and the contraction condition becomes
\[ \sum_i \sum_j a_{ij}^2 \le \alpha < 1. \tag{8} \]

Thus, if at least one of the conditions (6)-(8) holds, there exists a unique
point $x = (x_1, x_2, \ldots, x_n)$ such that
\[ x_i = \sum_{j=1}^{n} a_{ij} x_j + b_i \qquad (i = 1, \ldots, n). \tag{9} \]


The successive approximations to this solution of the equation
$x = Ax$ are of the form
\[
\begin{aligned}
x^{(0)} &= (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}), \\
x^{(1)} &= (x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)}), \\
&\;\;\vdots \\
x^{(k)} &= (x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}), \\
&\;\;\vdots
\end{aligned}
\]
where
\[ x_i^{(k)} = \sum_{j=1}^{n} a_{ij} x_j^{(k-1)} + b_i, \]
and we can choose any point $x^{(0)}$ as the “zeroth approximation.”

Each of the conditions (6)—(8) is sufficient for applicability of the method 
of successive approximations, but none of them is necessary. In fact, examples 
can be constructed in which each of the conditions (6)-(8) is satisfied, but 
not the other two. 
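
The three conditions and the iteration can be sketched in Python as follows. The function names and the particular matrix are assumptions chosen for illustration; the matrix satisfies (6) but neither (7) nor (8).

```python
def row_sum(a):
    """Left-hand side of (6), maximized over i: max_i sum_j |a_ij|."""
    return max(sum(abs(v) for v in row) for row in a)


def column_sum(a):
    """Left-hand side of (7), maximized over j: max_j sum_i |a_ij|."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def square_sum(a):
    """Left-hand side of (8): the sum of a_ij**2 over all i and j."""
    return sum(v * v for row in a for v in row)


def approximate_solution(a, b, steps=300):
    """Iterate x^(k) = a x^(k-1) + b, starting from x^(0) = 0."""
    x = [0.0] * len(b)
    for _ in range(steps):
        x = [sum(aij * xj for aij, xj in zip(row, x)) + bi
             for row, bi in zip(a, b)]
    return x


a = [[0.9, 0.0],
     [0.9, 0.0]]
b = [1.0, 1.0]
print(row_sum(a), column_sum(a), square_sum(a))   # only the first is below 1
print(approximate_solution(a, b))                 # close to [10, 10]
```

For this system (9) reads $x_1 = 0.9 x_1 + 1$, $x_2 = 0.9 x_1 + 1$, whose solution is $x_1 = x_2 = 10$.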


Theorem 1 has the following useful generalization, which will be needed 
later (see Example 2, p. 75): 


THEOREM 1'. Given a continuous mapping $A$ of a complete metric space $R$
into itself, suppose $A^n$ is a contraction mapping ($n$ an integer $> 1$). Then
$A$ has a unique fixed point.

Proof. Choosing any point $x_0 \in R$, let
\[ x = \lim_{k \to \infty} A^{kn} x_0. \]
Then, by the continuity of $A$,
\[ Ax = \lim_{k \to \infty} A A^{kn} x_0. \]
But $A^n$ is a contraction mapping, and hence
\[ \rho(A^{kn} A x_0, A^{kn} x_0) \le \alpha \rho(A^{(k-1)n} A x_0, A^{(k-1)n} x_0) \le \cdots \le \alpha^k \rho(A x_0, x_0), \]
where $\alpha < 1$. It follows that
\[ \rho(Ax, x) = \lim_{k \to \infty} \rho(A A^{kn} x_0, A^{kn} x_0) = 0, \]
i.e., $Ax = x$, so that $x$ is a fixed point of $A$. To prove the uniqueness of $x$,
we merely note that if $A$ has more than one fixed point, then so does $A^n$,
which is impossible, by Theorem 1, since $A^n$ is a contraction
mapping. ∎


8.2. Contraction mappings and differential equations. The most interesting 
applications of Theorems 1 and 1' arise when the space R is a function
space. We can then use these theorems to prove a number of existence and 
uniqueness theorems for differential and integral equations, as shown in this 
section and the next. 


THEOREM 2 (Picard). Given a function $f(x, y)$ defined and continuous
on a plane domain $G$ containing the point $(x_0, y_0)$,¹² suppose $f$ satisfies a
Lipschitz condition of the form
\[ |f(x, y_1) - f(x, y_2)| \le M |y_1 - y_2| \]
in the variable $y$. Then there is an interval $|x - x_0| \le \delta$ in which the
differential equation
\[ \frac{dy}{dx} = f(x, y) \tag{10} \]
has a unique solution
\[ y = \varphi(x) \]
satisfying the initial condition
\[ \varphi(x_0) = y_0. \tag{11} \]


Proof. Together the differential equation (10) and the initial condition
(11) are equivalent to the integral equation
\[ \varphi(x) = y_0 + \int_{x_0}^{x} f(t, \varphi(t)) \, dt. \tag{12} \]
By the continuity of $f$, we have
\[ |f(x, y)| \le K \tag{13} \]
in some domain $G' \subset G$ containing the point $(x_0, y_0)$.¹³ Choose $\delta > 0$
such that
1) $(x, y) \in G'$ if $|x - x_0| \le \delta$, $|y - y_0| \le K\delta$;
2) $M\delta < 1$,
and let $C^*$ be the space of continuous functions $\varphi$ defined on the interval
$|x - x_0| \le \delta$ and such that $|\varphi(x) - y_0| \le K\delta$, equipped with the metric
\[ \rho(\varphi, \tilde{\varphi}) = \max_x |\varphi(x) - \tilde{\varphi}(x)|. \]

12 By an $n$-dimensional domain we mean an open connected set in Euclidean $n$-space
$R^n$ (connectedness is defined in Problem 12, p. 55).
13 In fact, $f$ is bounded on $[G']$ if $[G'] \subset G$ (cf. Theorem 2, p. 110).


The space $C^*$ is complete, since it is a closed subspace of the space of all
continuous functions on $[x_0 - \delta, x_0 + \delta]$. Consider the mapping
$\psi = A\varphi$ defined by the integral equation
\[ \psi(x) = y_0 + \int_{x_0}^{x} f(t, \varphi(t)) \, dt \qquad (|x - x_0| \le \delta). \]


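Clearly $A$ is a contraction mapping carrying $C^*$ into itself. In fact, if
$\varphi \in C^*$, $|x - x_0| \le \delta$, then
\[ |\psi(x) - y_0| = \left| \int_{x_0}^{x} f(t, \varphi(t)) \, dt \right| \le \left| \int_{x_0}^{x} |f(t, \varphi(t))| \, dt \right| \le K |x - x_0| \le K\delta \]
by (13), and hence $\psi = A\varphi$ also belongs to $C^*$. Moreover,
\[ |\psi(x) - \tilde{\psi}(x)| \le \left| \int_{x_0}^{x} |f(t, \varphi(t)) - f(t, \tilde{\varphi}(t))| \, dt \right| \le M\delta \max_x |\varphi(x) - \tilde{\varphi}(x)|, \]
and hence
\[ \rho(\psi, \tilde{\psi}) \le M\delta \, \rho(\varphi, \tilde{\varphi}) \]
after maximizing with respect to $x$. But $M\delta < 1$, so that $A$ is a contraction
mapping. It follows from Theorem 1 that the equation $\varphi = A\varphi$,
i.e., the integral equation (12), has a unique solution in the space $C^*$. ∎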
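
The mapping $A$ of the proof can be applied exactly in simple cases. The following Python sketch assumes, purely for illustration, the problem $y' = y$, $y(0) = 1$, for which $M = 1$ and every approximation is a polynomial; `picard_step` and `evaluate` are hypothetical helper names.

```python
import math
from fractions import Fraction


def picard_step(phi):
    """Apply A to a polynomial phi (coefficients from degree 0 upward)
    for y' = y, y(0) = 1:  (A phi)(x) = 1 + integral from 0 to x of phi(t) dt."""
    return [Fraction(1)] + [c / (k + 1) for k, c in enumerate(phi)]


def evaluate(phi, x):
    """Value of the polynomial phi at the point x."""
    return sum(float(c) * x ** k for k, c in enumerate(phi))


phi = [Fraction(1)]               # zeroth approximation: phi_0(x) = 1
for _ in range(10):
    phi = picard_step(phi)

# The approximations are the partial sums of the series for e^x.
print(phi[:4], evaluate(phi, 0.5), math.exp(0.5))
```

Since $M = 1$ here, the theorem guarantees a contraction only for $\delta < 1$, which is why the comparison is made at $x = 0.5$.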


Theorem 2 can easily be generalized to the case of systems of differential 
equations: 


THEOREM 2'. Given $n$ functions $f_i(x, y_1, \ldots, y_n)$ defined and continuous
on an $(n + 1)$-dimensional domain $G$ containing the point
\[ (x_0, y_{01}, \ldots, y_{0n}), \]
suppose each $f_i$ satisfies a Lipschitz condition of the form
\[ |f_i(x, y_1, \ldots, y_n) - f_i(x, \tilde{y}_1, \ldots, \tilde{y}_n)| \le M \max_{1 \le j \le n} |y_j - \tilde{y}_j| \]
in the variables $y_1, \ldots, y_n$. Then there is an interval $|x - x_0| \le \delta$ in which
the system of differential equations
\[ \frac{dy_i}{dx} = f_i(x, y_1, \ldots, y_n) \qquad (i = 1, \ldots, n) \tag{14} \]
has a unique solution
\[ y_1 = \varphi_1(x), \ldots, y_n = \varphi_n(x) \]
satisfying the initial conditions
\[ \varphi_1(x_0) = y_{01}, \ldots, \varphi_n(x_0) = y_{0n}. \tag{15} \]


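Proof. Together the differential equations (14) and the initial conditions
(15) are equivalent to the system of integral equations
\[ \varphi_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \varphi_1(t), \ldots, \varphi_n(t)) \, dt \qquad (i = 1, \ldots, n). \tag{16} \]
By the continuity of the functions $f_i$, we have
\[ |f_i(x, y_1, \ldots, y_n)| \le K \qquad (i = 1, \ldots, n) \tag{17} \]
in some domain $G' \subset G$ containing the point $(x_0, y_{01}, \ldots, y_{0n})$. Choose
$\delta > 0$ such that
1) $(x, y_1, \ldots, y_n) \in G'$ if $|x - x_0| \le \delta$, $|y_i - y_{0i}| \le K\delta$ for all $i = 1, \ldots, n$;
2) $M\delta < 1$.
This time let $C^*$ be the space of ordered $n$-tuples
\[ \varphi = (\varphi_1, \ldots, \varphi_n) \]
of continuous functions $\varphi_1, \ldots, \varphi_n$ defined on the interval $|x - x_0| \le \delta$
such that $|\varphi_i(x) - y_{0i}| \le K\delta$ for all $i = 1, \ldots, n$, equipped with the
metric
\[ \rho(\varphi, \tilde{\varphi}) = \max_{x, i} |\varphi_i(x) - \tilde{\varphi}_i(x)|. \]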


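Clearly $C^*$ is complete. Moreover, the mapping $\psi = A\varphi$ defined by the
system of integral equations
\[ \psi_i(x) = y_{0i} + \int_{x_0}^{x} f_i(t, \varphi_1(t), \ldots, \varphi_n(t)) \, dt \qquad (|x - x_0| \le \delta, \; i = 1, \ldots, n) \]
is a contraction mapping carrying $C^*$ into itself. In fact, if
\[ \varphi = (\varphi_1, \ldots, \varphi_n) \in C^*, \qquad |x - x_0| \le \delta, \]
then
\[ |\psi_i(x) - y_{0i}| = \left| \int_{x_0}^{x} f_i(t, \varphi_1(t), \ldots, \varphi_n(t)) \, dt \right| \le K\delta \qquad (i = 1, \ldots, n) \]
by (17), so that $\psi = (\psi_1, \ldots, \psi_n) = A\varphi$ also belongs to $C^*$. Moreover,
\[ |\psi_i(x) - \tilde{\psi}_i(x)| \le \left| \int_{x_0}^{x} |f_i(t, \varphi_1(t), \ldots, \varphi_n(t)) - f_i(t, \tilde{\varphi}_1(t), \ldots, \tilde{\varphi}_n(t))| \, dt \right| \le M\delta \max_{x, j} |\varphi_j(x) - \tilde{\varphi}_j(x)|, \]
and hence
\[ \rho(\psi, \tilde{\psi}) \le M\delta \, \rho(\varphi, \tilde{\varphi}) \]
after maximizing with respect to $x$ and $i$. But $M\delta < 1$, so that $A$ is a
contraction mapping. It follows from Theorem 1 that the equation
$\varphi = A\varphi$, i.e., the system of integral equations (16), has a unique solution
in the space $C^*$. ∎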


8.3. Contraction mappings and integral equations. We now show how the 
method of successive approximations can be used to prove the existence and 
uniqueness of solutions of integral equations. 


Example 1. By a Fredholm equation (of the second kind) is meant an
integral equation of the form
\[ f(x) = \lambda \int_a^b K(x, y) f(y) \, dy + \varphi(x), \tag{18} \]
involving two given functions $K$ and $\varphi$, an unknown function $f$ and an
arbitrary parameter $\lambda$. The function $K$ is called the kernel of the equation,
and the equation is said to be homogeneous if $\varphi \equiv 0$ (but otherwise
nonhomogeneous).

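Suppose $K(x, y)$ and $\varphi(x)$ are continuous on the square $a \le x \le b$,
$a \le y \le b$, so that in particular
\[ |K(x, y)| \le M \qquad (a \le x \le b, \; a \le y \le b). \]
Consider the mapping $g = Af$ of the complete metric space $C_{[a,b]}$ into itself
given by
\[ g(x) = \lambda \int_a^b K(x, y) f(y) \, dy + \varphi(x). \]
Clearly, if $g_1 = Af_1$, $g_2 = Af_2$, then
\[
\begin{aligned}
\rho(g_1, g_2) &= \max_x |g_1(x) - g_2(x)| \le |\lambda| M (b - a) \max_x |f_1(x) - f_2(x)| \\
&= |\lambda| M (b - a) \rho(f_1, f_2),
\end{aligned}
\]
so that $A$ is a contraction mapping if
\[ |\lambda| < \frac{1}{M(b - a)}. \tag{19} \]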


It follows from Theorem 1 that the integral equation (18) has a unique
solution for any value of $\lambda$ satisfying (19). The successive approximations
$f_0, f_1, \ldots, f_n, \ldots$ to this solution are given by
\[ f_n(x) = \lambda \int_a^b K(x, y) f_{n-1}(y) \, dy + \varphi(x) \qquad (n = 1, 2, \ldots), \]
where any function continuous on $[a, b]$ can be chosen as $f_0$. Note that the
method of successive approximations can be applied to the equation (18)
only for sufficiently small $|\lambda|$.
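
A numerical sketch of these approximations in Python follows. Replacing the integral by the trapezoidal rule on a uniform grid is an assumption of the sketch, not part of the method in the text, and the kernel $K(x, y) = xy$ with $\varphi(x) = x$ on $[0, 1]$ is chosen only because the exact solution $f(x) = 3x/(3 - \lambda)$ is then easy to verify by substitution into (18).

```python
def fredholm_approximations(kernel, phi, lam, a, b, n_grid=100, steps=30):
    """Return grid points xs and the values of f_steps on them, where
    f_0 = phi and f_n = lam * (integral of K f_{n-1}) + phi."""
    h = (b - a) / n_grid
    xs = [a + i * h for i in range(n_grid + 1)]
    w = [h] * (n_grid + 1)
    w[0] = w[-1] = h / 2                  # trapezoidal weights
    f = [phi(x) for x in xs]              # zeroth approximation f_0 = phi
    for _ in range(steps):
        f = [lam * sum(wj * kernel(x, y) * fj
                       for wj, y, fj in zip(w, xs, f)) + phi(x)
             for x in xs]
    return xs, f


# Here M = 1 and b - a = 1, so (19) reads |lambda| < 1; take lambda = 1/2,
# for which the exact solution is f(x) = 1.2 x.
xs, f = fredholm_approximations(lambda x, y: x * y, lambda x: x, 0.5, 0.0, 1.0)
error = max(abs(fi - 1.2 * xi) for xi, fi in zip(xs, f))
print(error)
```

The remaining error comes from the quadrature rule rather than from the iteration, which converges geometrically.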


Example 2. Next consider the Volterra equation
\[ f(x) = \lambda \int_a^x K(x, y) f(y) \, dy + \varphi(x), \tag{20} \]
which differs from the Fredholm equation (18) by having the variable $x$
rather than the fixed number $b$ as the upper limit of integration.¹⁴ It is easy
to see that the method of successive approximations can be applied to the
Volterra equation (20) for arbitrary $\lambda$, not just for sufficiently small $|\lambda|$ as
in the case of the Fredholm equation (18). In fact, let $A$ be the mapping
of $C_{[a,b]}$ into itself defined by
\[ Af(x) = \lambda \int_a^x K(x, y) f(y) \, dy + \varphi(x), \]
and let $f_1, f_2 \in C_{[a,b]}$. Then
\[
\begin{aligned}
|Af_1(x) - Af_2(x)| &= \left| \lambda \int_a^x K(x, y) [f_1(y) - f_2(y)] \, dy \right| \\
&\le |\lambda| M (x - a) \max_x |f_1(x) - f_2(x)|,
\end{aligned}
\]
where
\[ M = \max_{x, y} |K(x, y)|. \]


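It follows that
\[
\begin{aligned}
|A^2 f_1(x) - A^2 f_2(x)| &\le |\lambda|^2 M^2 \max_x |f_1(x) - f_2(x)| \int_a^x (t - a) \, dt \\
&= |\lambda|^2 M^2 \frac{(x - a)^2}{2} \max_x |f_1(x) - f_2(x)|,
\end{aligned}
\]
and in general,
\[
\begin{aligned}
|A^n f_1(x) - A^n f_2(x)| &\le |\lambda|^n M^n \frac{(x - a)^n}{n!} \max_x |f_1(x) - f_2(x)| \\
&\le |\lambda|^n M^n \frac{(b - a)^n}{n!} \max_x |f_1(x) - f_2(x)|,
\end{aligned}
\]
which implies
\[ \rho(A^n f_1, A^n f_2) \le |\lambda|^n M^n \frac{(b - a)^n}{n!} \rho(f_1, f_2). \]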





14 Equation (20) can be regarded formally as a special case of (18) by extending the
definition of the kernel, i.e., by setting
\[ K(x, y) = 0 \quad \text{if} \quad y > x. \]


But, given any $\lambda$, we can always choose $n$ large enough to make
\[ |\lambda|^n M^n \frac{(b - a)^n}{n!} < 1, \]
i.e., $A^n$ is a contraction mapping for sufficiently large $n$. It follows from
Theorem 1' that the integral equation (20) has a unique solution for arbitrary $\lambda$.
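
How large $n$ must be depends only on the product $|\lambda| M (b - a)$. A small Python sketch (the function name `contraction_power` is hypothetical) finds the smallest such $n$:

```python
import math


def contraction_power(lam, M, length):
    """Smallest n >= 1 with (|lam| * M * length)**n / n! < 1."""
    c = abs(lam) * M * length
    n, term = 1, c                 # term holds c**n / n!
    while term >= 1:
        n += 1
        term *= c / n
    return n


n = contraction_power(5.0, 1.0, 1.0)
print(n, 5.0 ** n / math.factorial(n))
```

The loop always terminates, since $c^n/n! \to 0$ as $n \to \infty$ for every fixed $c$.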


Problem 1. Let A be a mapping of a metric space R into itself. Prove that 
the condition 


\[ \rho(Ax, Ay) < \rho(x, y) \qquad (x \ne y) \]
is insufficient for the existence of a fixed point of A. 


Problem 2. Let $F(x)$ be a continuously differentiable function defined on
the interval $[a, b]$ such that $F(a) < 0$, $F(b) > 0$ and
\[ 0 < K_1 \le F'(x) \le K_2 \qquad (a \le x \le b). \]
Use Theorem 1 to find the unique root of the equation $F(x) = 0$.


Hint. Introduce the auxiliary function $f(x) = x - \lambda F(x)$, and choose $\lambda$
such that the theorem works for the equivalent equation $f(x) = x$.


Problem 3. Devise a proof of the implicit function theorem based on the
use of the fixed point theorem.¹⁵


Problem 4. Prove that the method of successive approximations can be
used to solve the system (9) if $|a_{ij}| < 1/n$ (for all $i$ and $j$), but not if $|a_{ij}| = 1/n$.


Problem 5. Prove that the condition (6) is necessary for the mapping (5)
to be a contraction mapping in the space $R_0^n$.


Problem 6. Prove that any of the conditions (6)-(8) implies 


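\[
\begin{vmatrix}
a_{11} - 1 & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} - 1 & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} - 1
\end{vmatrix} \ne 0.
\]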


Comment. Hence the fact that the system (5) has a unique solution (under 
suitable conditions) follows from Cramer’s rule as well as from the fixed 
point theorem. 





15 See e.g., I. G. Petrovski, Ordinary Differential Equations (translated by R. A. Silverman),
Prentice-Hall, Inc., Englewood Cliffs, N.J. (1966), p. 47.

Problem 7. Consider the nonlinear integral equation
\[ f(x) = \lambda \int_a^b K(x, y; f(y)) \, dy + \varphi(x) \tag{21} \]
with continuous $K$ and $\varphi$, where $K$ satisfies a Lipschitz condition of the form
\[ |K(x, y; z_1) - K(x, y; z_2)| \le M |z_1 - z_2| \]
in its “functional” argument. Prove that (21) has a unique solution for all
\[ |\lambda| < \frac{1}{M(b - a)}. \]


Write the successive approximations to this solution. 


3 


TOPOLOGICAL SPACES 


9. Basic Concepts 


9.1. Definitions and examples. In our study of metric spaces, we defined 
a number of key ideas like contact point, limit point, closure of a set, etc. 
In each case, the definition rests on the notion of a neighborhood, or, what 
amounts to the same thing, the notion of an open set. These notions (neigh- 
borhood and open set) were in turn defined by using the metric (or distance) 
in the given space. However, instead of introducing a metric in a given set 
X, we can go about things differently, by specifying a system of open sets 
in X with suitable properties. This approach leads to the notion of a topo- 
logical space. Metric spaces are topological spaces of a rather special 
(although very important) kind. 


DEFINITION 1. Given a set $X$, by a topology in $X$ is meant a system $\tau$ of
subsets $G \subset X$, called open sets (relative to $\tau$), with the following two
properties:

1) The set $X$ itself and the empty set $\emptyset$ belong to $\tau$;
2) Arbitrary (finite or infinite) unions $\bigcup_\alpha G_\alpha$ and finite intersections
$\bigcap_{k=1}^{n} G_k$ of open sets belong to $\tau$.


DEFINITION 2. By a topological space is meant a pair $(X, \tau)$, consisting
of a set $X$ and a topology $\tau$ defined in $X$.

Just as a metric space is a pair consisting of a set $X$ and a metric defined in
$X$, so a topological space is a pair consisting of a set $X$ and a topology defined
in $X$. Thus, to specify a topological space, we must specify both a set $X$ and
a topology in $X$, i.e., we must indicate which subsets of $X$ are to be regarded
as “open (in $X$).” Clearly, we can equip one and the same set with various
different topologies, thereby defining various different topological spaces.
Nevertheless, we will usually denote a topological space, namely a pair $(X, \tau)$,
by a single letter like $T$. Just as in the case of a metric space $R$, the elements
of a topological space $T$ will be called the points of $T$.

By the closed sets of a topological space $T$, we mean the complements
$T - G$ of the open sets $G$ of $T$. It follows from Definition 1 and the “duality
principle” (see p. 4) that

1') The space $T$ itself and the empty set $\emptyset$ are closed;
2') Arbitrary (finite or infinite) intersections $\bigcap_\alpha F_\alpha$ and finite unions
$\bigcup_{k=1}^{n} F_k$ of closed sets of $T$ are closed.


The natural way of introducing the concepts of neighborhood, contact 
point, limit point and closure of a set is now apparent: 


a) By a neighborhood of a point $x$ in a topological space $T$ is meant any
open set $G \subset T$ containing $x$;

b) A point $x \in T$ is called a contact point of a set $M \subset T$ if every
neighborhood of $x$ contains at least one point of $M$;

c) A point $x \in T$ is called a limit point of a set $M \subset T$ if every
neighborhood of $x$ contains infinitely many points of $M$;

d) The set of all contact points of a set $M \subset T$ is called the closure of
$M$, denoted by $[M]$.


Example 1. According to Theorem 5, p. 50, the open sets in any metric 
space satisfy the two properties in Definition 1. Hence every metric space 
is a topological space as well. 


Example 2. Given any set $T$, suppose we regard every subset of $T$ as open.
Then $T$ is a topological space (the properties in Definition 1 are obviously
satisfied). In particular, every set $M \subset T$ is both open and closed, and every
set $M \subset T$ coincides with its own closure. Note that the “discrete metric
space” of Example 1, p. 38 has this trivial topology.


Example 3. As another extreme case, consider an arbitrary set $T$ equipped
with a topology consisting of just two sets, the whole set $T$ and the empty
set $\emptyset$. Then $T$ is a topological space, a kind of “space of coalesced points”
(mainly of academic interest). Note that the closure of every nonempty set
is the whole space $T$.


Example 4. Let $T$ be the set $\{a, b\}$, consisting of just two points $a$ and $b$,
and let the open sets in $T$ be $T$ itself, the empty set and the single-element set
$\{b\}$. Then the two properties in Definition 1 are satisfied, and $T$ is a
topological space. The closed sets in this space are $T$ itself, the empty set and the
set $\{a\}$. Note that the closure of $\{b\}$ is the whole space $T$.

9.2. Comparison of topologies. Let $\tau_1$ and $\tau_2$ be two topologies defined
in the same set $X$.¹ Then we say that the topology $\tau_1$ is stronger than the
topology $\tau_2$ (or equivalently that $\tau_2$ is weaker than $\tau_1$) if $\tau_2 \subset \tau_1$, i.e., if
every set of the system $\tau_2$ is a set of the system $\tau_1$.
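
For a finite set these notions can be checked mechanically. The Python sketch below uses hypothetical function names and works only for finite sets; it tests the two properties of Definition 1, computes closures as sets of contact points, and compares two topologies, using the two-point space of Example 4 as data.

```python
from itertools import combinations


def is_topology(X, tau):
    """Check the two properties of Definition 1 for a finite set X."""
    X = frozenset(X)
    tau = {frozenset(G) for G in tau}
    if X not in tau or frozenset() not in tau:
        return False
    if not all(G <= X for G in tau):
        return False
    # For finitely many sets, closure under pairwise unions and
    # intersections gives closure under all unions and intersections.
    return all((G | H) in tau and (G & H) in tau
               for G, H in combinations(tau, 2))


def closure(X, tau, M):
    """Contact points of M: points x such that every open set containing x meets M."""
    M = frozenset(M)
    return {x for x in X if all(frozenset(G) & M for G in tau if x in G)}


def is_stronger(tau1, tau2):
    """True if every set of tau2 is also a set of tau1."""
    return {frozenset(G) for G in tau2} <= {frozenset(G) for G in tau1}


T = {"a", "b"}
tau = [set(), {"b"}, {"a", "b"}]                  # the topology of Example 4
discrete = [set(), {"a"}, {"b"}, {"a", "b"}]      # every subset is open
print(is_topology(T, tau), closure(T, tau, {"b"}), is_stronger(discrete, tau))
```

The discrete topology is stronger than every topology in the same set, and the topology of Example 3 is weaker than every one.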


THEOREM 1. The intersection $\tau = \bigcap_\alpha \tau_\alpha$ of any set of topologies in $X$
is itself a topology in $X$.


Proof. Clearly $\bigcap_\alpha \tau_\alpha$ contains $X$ and $\emptyset$. Moreover, since every $\tau_\alpha$ is
closed (algebraically) under the operations of taking arbitrary unions and
finite intersections, the same is true of $\bigcap_\alpha \tau_\alpha$. ∎


COROLLARY. Let $\mathcal{B}$ be any system of subsets of a set $X$. Then there
exists a minimal topology in $X$ containing $\mathcal{B}$, i.e., a topology $\tau(\mathcal{B})$
containing $\mathcal{B}$ and contained in every topology containing $\mathcal{B}$.


Proof. A topology containing $\mathcal{B}$ always exists, e.g., the topology
in which every subset of $X$ is open. The intersection of all topologies
containing $\mathcal{B}$ is the desired minimal topology $\tau(\mathcal{B})$, often called the
topology generated by the system $\mathcal{B}$. ∎
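
For a finite set $X$ the generated topology can be computed directly. The Python sketch below takes a different route from the proof, as an assumption of the sketch: instead of intersecting all topologies containing the system, it closes the system under pairwise unions and intersections. For finite $X$ this gives the same minimal topology, since every topology containing the system must contain each set produced. The name `generated_topology` is hypothetical.

```python
def generated_topology(X, system):
    """Smallest topology in the finite set X containing every set of `system`."""
    tau = {frozenset(X), frozenset()} | {frozenset(B) for B in system}
    changed = True
    while changed:
        changed = False
        for G in list(tau):
            for H in list(tau):
                for S in (G | H, G & H):
                    if S not in tau:
                        tau.add(S)
                        changed = True
    return tau


tau = generated_topology({1, 2, 3}, [{1, 2}, {2, 3}])
print(sorted(sorted(G) for G in tau))
```

Here the intersection $\{2\}$ of the two given sets has to be added, while their union is already the whole set.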


Let $\mathcal{B}$ be a system of subsets of $X$ and $A$ a fixed subset of $X$. Then by
the trace of the system $\mathcal{B}$ on the set $A$ we mean the system $\mathcal{B}_A$ consisting of
all subsets of $X$ of the form $A \cap B$, $B \in \mathcal{B}$. It is easy to see that the trace
(on $A$) of a topology $\tau$ (defined in $X$) is a topology $\tau_A$ in $A$. (Such a topology
is often called a relative topology.) In this sense, every subset $A$ of a
given topological space $(X, \tau)$ generates a new topological space $(A, \tau_A)$,
called a subspace of the original topological space $(X, \tau)$.


9.3. Bases. Axioms of countability. As we have seen, defining a topology
in a space T means specifying a system of open sets in T. However, in many 
concrete problems, it is more convenient to specify, instead of all the open 
sets, some system of subsets which uniquely determines all the open sets. 
For example, in the case of a metric space we first introduced the notion of 
an open sphere ($\varepsilon$-neighborhood) and then defined an open set $G$ as a set such
that every point $x \in G$ has a neighborhood $O_\varepsilon(x) \subset G$. In other words, the
open sets in a metric space are precisely those which can be represented as 
finite or infinite unions of open spheres. In particular, the open sets on the 
real line are precisely those which can be represented as finite or countable 
unions of open intervals (recall Theorem 6, p. 51). These considerations 
suggest 


1 This gives two topological spaces $T_1 = (X, \tau_1)$ and $T_2 = (X, \tau_2)$.

DEFINITION 3. A family $\mathcal{G}$ of open subsets of a topological space $T$ is
called a base for $T$ if every open set in $T$ can be represented as a union of
sets in $\mathcal{G}$.


Example 1. The set of all open spheres (of all possible radii and with all 
possible centers) in a metric space R is a base for R. In particular, the set 
of all open intervals is a base on the real line. The set of all open intervals 
with rational end points is also a base on the line, since any open interval 
(and hence any open set on the line) can be represented as a union of such 
intervals. 


It is clear from the foregoing that a topology $\tau$ can be defined in a set $T$
by specifying a base $\mathcal{G}$ in $T$. This topology $\tau$ is just the system of sets which
can be represented as unions of sets in $\mathcal{G}$. If this way of specifying a topology
is to be of practical value, we must find requirements which, when imposed
on a system $\mathcal{G}$ of subsets of a given set $T$, guarantee that the system $\tau$ of all
possible unions of sets in $\mathcal{G}$ be a topology in $T$, i.e., that $\tau$ have the two
properties figuring in Definition 1:


THEOREM 2. Given a set $T$, let $\mathcal{G}$ be a system of subsets $G_\alpha \subset T$ with the
following two properties:


1) Every point $x \in T$ belongs to at least one $G_\alpha \in \mathcal{G}$;
2) If $x \in G_\alpha \cap G_\beta$, then there is a $G_\gamma \in \mathcal{G}$ such that $x \in G_\gamma \subset G_\alpha \cap G_\beta$.


Suppose the empty set $\emptyset$ and all sets representable as unions of sets $G_\alpha$
are designated as open. Then $T$ is a topological space, and $\mathcal{G}$ is a base for $T$.


Proof. It follows at once from the conditions of the theorem that the
whole set $T$ and the empty set $\emptyset$ are open sets, and that the union of any
number of open sets is open. We must still show that the intersection of
a finite number of open sets is open. It is enough to prove this for just
two sets. Thus let
\[ A = \bigcup_\alpha G_\alpha, \qquad B = \bigcup_\beta G_\beta. \]
Then
\[ A \cap B = \bigcup_{\alpha, \beta} (G_\alpha \cap G_\beta). \tag{1} \]
By hypothesis, given any point $x \in G_\alpha \cap G_\beta$, there is a $G_\gamma \in \mathcal{G}$ such that
$x \in G_\gamma \subset G_\alpha \cap G_\beta$. Hence the set $G_\alpha \cap G_\beta$ is open, being the union of
all $G_\gamma$ contained in $G_\alpha \cap G_\beta$. But then (1) is also open. Therefore $T$ is a
topological space. The fact that $\mathcal{G}$ is a base for $T$ is clear from the way
open sets in $T$ are defined. ∎
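
For a finite set the two conditions of Theorem 2 can be tested directly. The following Python sketch uses the hypothetical name `satisfies_theorem_2` and applies only to finite systems.

```python
def satisfies_theorem_2(T, system):
    """Check conditions 1) and 2) of Theorem 2 for a finite system of subsets of T."""
    system = [frozenset(G) for G in system]
    # 1) every point of T belongs to at least one set of the system
    covers = all(any(x in G for G in system) for x in T)
    # 2) each point of an intersection of two sets of the system lies in
    #    some set of the system contained in that intersection
    refines = all(
        any(x in C and C <= (A & B) for C in system)
        for A in system for B in system for x in A & B
    )
    return covers and refines


T = {1, 2, 3}
print(satisfies_theorem_2(T, [{1, 2}, {2, 3}]))        # False: {2} is missing
print(satisfies_theorem_2(T, [{2}, {1, 2}, {2, 3}]))   # True
```

In the first system the point 2 lies in the intersection of the two sets, but no set of the system containing 2 is contained in that intersection, so condition 2) fails.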


The following theorem is a useful tool for deciding whether or not a
given system of open sets is a base:

THEOREM 3. A system $\mathcal{G}$ of open sets $G_\alpha$ in a topological space $T$ is a
base for $T$ if and only if, given any open set $G \subset T$ and any point $x \in G$,
there is a set $G_\alpha \in \mathcal{G}$ such that $x \in G_\alpha \subset G$.


Proof. If $\mathcal{G}$ is a base for $T$, then every open set $G \subset T$ is a union
\[ G = \bigcup_\alpha G_\alpha \]
of sets $G_\alpha \in \mathcal{G}$. Therefore every point $x \in G$ is contained in some set
$G_\alpha \subset G$. Conversely, given any open set $G \subset T$, suppose that for every
point $x \in G$ there is a set $G_\alpha(x) \in \mathcal{G}$ such that $x \in G_\alpha(x) \subset G$. Then
\[ G = \bigcup_{x \in G} G_\alpha(x), \]
i.e., $G$ is a union of sets in $\mathcal{G}$. ∎


Example 2. It follows from Theorem 3 that the set of all open spheres 
with rational radii (and all possible centers) in a metric space R is a base for 
R (this is obvious anyway). In particular, as already noted in Example 1, 
the set of all open intervals with rational end points is a base for the real line. 


An important class of topological spaces consists of spaces with a countable 
base, i.e., spaces in which there is at least one base containing no more than 
countably many sets. Such a space is also said to satisfy the second axiom of 
countability. 


THEOREM 4. If a topological space $T$ has a countable base, then $T$
contains a countable everywhere dense subset, i.e., a countable set $M \subset T$
such that $[M] = T$.


Proof. Let $\mathcal{G} = \{G_1, G_2, \ldots, G_n, \ldots\}$ be a countable base for $T$,
and choose a point $x_n$ in each $G_n$. Then the set
\[ M = \{x_1, x_2, \ldots, x_n, \ldots\} \]
is countable. Moreover, $M$ is everywhere dense in $T$, since otherwise
the nonempty open set $G = T - [M]$ would contain no points of $M$.
But this is impossible, since $G$ is a union of some of the sets $G_n$ in $\mathcal{G}$ and
$G_n$ contains the point $x_n \in M$. ∎


For metric spaces, we can say even more: 


THEOREM 5. If a metric space R has a countable everywhere dense 
subset, then R has a countable base. 


Proof. Suppose $R$ has a countable everywhere dense subset
$\{x_1, x_2, \ldots, x_n, \ldots\}$. Then, given any open set $G \subset R$ and any $x \in G$, there
is an open sphere $S(x_m, 1/n)$ such that $x \in S(x_m, 1/n) \subset G$ for suitable
positive integers $m$ and $n$ (why?). Hence the open spheres $S(x_m, 1/n)$,
where $m$ and $n$ range over all positive integers, form a countable base for
$R$. ∎


Combining Theorems 4 and 5, we see that a metric space R has a countable 
base if and only if it has a countable everywhere dense subset. 


Example 3. Every separable metric space, i.e., every metric space with a 
countable everywhere dense subset, is a metric space satisfying the second 
axiom of countability. 


Example 4. The space m of all bounded sequences is not separable (recall 
Example 7, p. 48) and hence has no countable base. 


Remark. In general, Theorem 5 does not hold for arbitrary (nonmetric)
topological spaces. In fact, examples can be given of topological spaces
which have a countable everywhere dense subset but no countable base. Let us
see how this might come about. Given any point $x$ of a metric space $R$, there
is a countable neighborhood base (or local base) at $x$, i.e., a countable system
$\mathcal{O}$ of neighborhoods of $x$ with the following property: Given any open set $G$
containing $x$, there is a neighborhood $O \in \mathcal{O}$ such that $O \subset G$ (cf. Theorem
3).² Suppose every point $x$ of a topological space $T$ has a countable
neighborhood base. Then $T$ is said to satisfy the first axiom of countability.
However, this axiom need not be satisfied in an arbitrary topological space.
Hence the argument used in the case of metric spaces to deduce the existence
of a countable base from that of a countable everywhere dense subset does
not carry over to the case of an arbitrary topological space.


A system $\mathcal{M}$ of sets $M_\alpha$ is called a cover (or covering) of a topological
space $T$, and $\mathcal{M}$ is said to cover $T$, if
\[ T = \bigcup_\alpha M_\alpha. \]
A cover consisting of open (or closed) sets only is called an open (or closed)
cover. If $\mathcal{M}$ is a cover of a topological space $T$, then by a subcover of $\mathcal{M}$
we mean any subset of $\mathcal{M}$ which also covers $T$.


THEOREM 6. If $T$ is a topological space with a countable base $\mathcal{G}$, then
every open cover $\mathcal{O}$ has a finite or countable subcover.


Proof. Since $\mathcal{O}$ covers $T$, each point $x \in T$ belongs to some open set
$O_\alpha \in \mathcal{O}$. Moreover, since $\mathcal{G}$ is a countable base for $T$, for each $x \in T$
there is a set $G_n(x) \in \mathcal{G}$ such that $x \in G_n(x) \subset O_\alpha$ (recall Theorem 3).


2 For example, the set of open spheres $S(x, 1/n)$ is a countable neighborhood base at
any point $x$ of a metric space $R$.

The collection of all sets $G_n(x)$ selected in this way is finite or countable
and covers $T$. For each $G_n(x)$ we now choose one of the sets $O_\alpha$ containing
$G_n(x)$, thereby obtaining a finite or countable subcover of $\mathcal{O}$. ∎


Given any topological space T, the empty set @ and the space T itself 
are both open and closed, by definition. A topological space T is said to 
be connected if it has no subsets other than @ and T which are both open 
and closed. For example, the real line $R^1$ is connected, but not the set
$R^1 - \{x\}$ obtained from $R^1$ by deleting any point $x$.


9.4. Convergent sequences in a topological space. The concept of a
convergent sequence, introduced in Sec. 6.2 for the case of a metric space,
generalizes in the natural way to the case of a topological space. Thus a
sequence of points $\{x_n\} = x_1, x_2, \ldots, x_n, \ldots$ in a topological space $T$ is
said to converge to a point $x \in T$ (called the limit of the sequence) if every
neighborhood $G(x)$ of $x$ contains all points $x_n$ starting from a certain index.³
However, the concept of a convergent sequence does not play the same basic
role for topological spaces as for metric spaces. In fact, in the case of a
metric space $R$, a point $x$ is a contact point of a set $M \subset R$ if and only if $M$
contains a sequence converging to $x$. On the other hand, in the case of a
topological space $T$, this is in general not true, as shown by Problem 11.
In other words, a point $x$ can be a contact point of a set $M \subset T$ (i.e., $x$ can
belong to $[M]$) without $M$ containing a sequence converging to $x$. However,
convergent sequences “are given their rights back” if $T$ satisfies the first
axiom of countability, i.e., if there is a countable neighborhood base at every
point $x \in T$:


THEOREM 7. If a topological space $T$ satisfies the first axiom of
countability, then every contact point $x$ of a set $M \subset T$ is the limit of a
convergent sequence of points in $M$.


Proof. Let $\mathcal{O}$ be a countable neighborhood base at $x$, consisting of
sets $O_n$. It can be assumed that $O_{n+1} \subset O_n$ ($n = 1, 2, \ldots$), since otherwise
we need only replace $O_n$ by $\bigcap_{k=1}^{n} O_k$. Let $x_n$ be any point of $M$
contained in $O_n$. Such a point $x_n$ can always be found, since $x$ is a
contact point of $M$. Then the sequence $\{x_n\}$ obviously converges to
$x$. ∎


Remark. As already noted, every metric space satisfies the first axiom
of countability. This, together with Theorem 7, shows why in the case of
metric spaces we were able to formulate concepts like contact point, limit
point, etc. in terms of convergent sequences (recall Theorems 2 and 2',
p. 48).

3 More exactly, if, given any $G(x)$, there is an integer $N_G$ such that $G(x)$ contains all
points $x_n$ with $n \ge N_G$.


9.5. Axioms of separation. Although many basic concepts of the theory 
of metric spaces carry over easily to the case of topological spaces, an 
arbitrary topological space is still too general an object for most problems 
of analysis. In fact, things can happen in an arbitrary topological space 
which differ in an essential way from what happens in a metric space. Thus, 
for example, a finite set of points need not be closed in an arbitrary topo- 
logical space, as shown in Example 4, p. 79. Hence it is desirable to 
specialize the notion of a topological space somewhat by considering topo- 
logical spaces more closely resembling metric spaces. This is done by 
imposing extra conditions on a topological space T, in addition to the two 
defining properties figuring in Definition 1, p. 78. For example, as we 
have already seen, the axioms of countability allow us to study topological 
spaces from the standpoint of the concept of convergence. We now introduce 
supplementary conditions, called axioms of separation, of quite a different 
type: 

DEFINITION 4. Suppose that for each pair of distinct points x and y in 
a topological space T, there is a neighborhood O_x of x and a neighborhood 
O_y of y such that x ∉ O_y, y ∉ O_x. Then T is said to satisfy the first axiom of 
separation, and is called a T_1-space. 


Example 1. The space in Example 2, p. 79 is a T_1-space, but not the space 
in Example 4. 


THEOREM 8. Every finite subset of a T_1-space is closed. 


Proof. Given any single-element set {x}, suppose y ≠ x. Then y 
has a neighborhood O_y which does not contain x, i.e., y ∉ [{x}]. 
Therefore [{x}] = {x}, i.e., every "singleton" {x} is closed. But every finite 
union of closed sets is itself closed. Hence every finite subset of the given 
space is closed. ∎ 


The next axiom of separation is stronger than the first axiom: 


DEFINITION 5. Suppose that for each pair of distinct points x and y in 
a topological space T, there is a neighborhood O_x of x and a neighborhood 
O_y of y such that O_x ∩ O_y = ∅. Then T is said to satisfy the second (or 
Hausdorff) axiom of separation, and is called a T_2-space or Hausdorff 
space. 


Thus, roughly speaking, each pair of distinct points in a Hausdorff space 
has a pair of disjoint neighborhoods. 


Example 2. Every Hausdorff space is a T_1-space, but not conversely (see 
Problem 10). 
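

For a finite space given by an explicit list of open sets, both separation axioms can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names and the two sample topologies on {a, b} are illustrative assumptions, and a "neighborhood" of a point is taken to be any open set containing it.

```python
from itertools import combinations

def neighborhoods(x, opens):
    """Open sets containing x (a neighborhood of x in the sense used here)."""
    return [G for G in opens if x in G]

def is_t1(points, opens):
    """First axiom: each of two distinct points has a neighborhood missing the other."""
    return all(
        any(y not in G for G in neighborhoods(x, opens))
        and any(x not in G for G in neighborhoods(y, opens))
        for x, y in combinations(points, 2)
    )

def is_hausdorff(points, opens):
    """Second axiom: two distinct points have disjoint neighborhoods."""
    return all(
        any(not (U & V)
            for U in neighborhoods(x, opens)
            for V in neighborhoods(y, opens))
        for x, y in combinations(points, 2)
    )

points = {"a", "b"}
# Hypothetical two-point space in which only {b} is a proper nonempty open set.
two_point = [frozenset(), frozenset({"b"}), frozenset({"a", "b"})]
# Discrete topology on the same two points.
discrete = [frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})]

print(is_t1(points, two_point), is_hausdorff(points, two_point))
print(is_t1(points, discrete), is_hausdorff(points, discrete))
```

In the first topology every neighborhood of a also contains b, so the space fails even the first axiom, and the singleton {b} is not closed.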


Topological spaces more general than Hausdorff spaces are rarely used 
in analysis. In fact, most of the topological spaces of interest in analysis 
satisfy a separation condition even stronger than the second axiom of 
separation: 


DEFINITION 6. A T_1-space T is said to be normal if for each pair of 
disjoint closed sets F_1 and F_2 in T, there is an open set O_1 containing F_1 
and an open set O_2 containing F_2 such that O_1 ∩ O_2 = ∅. 


In other words, each pair of disjoint closed sets in a normal space has a 
pair of disjoint “neighborhoods.” 


Example 3. Obviously, every normal space is a Hausdorff space. 


Example 4. Consider the closed unit interval [0, 1], where neighborhoods 
of any point x ≠ 0 are defined in the usual way (i.e., as open sets containing 
x), but neighborhoods of the point x = 0 are all half-open intervals [0, α) 
with the points 


    1, 1/2, ..., 1/n, ...                                        (2) 


deleted (and arbitrary unions and finite intersections of these neighborhoods 
with neighborhoods of nonzero points). This space is Hausdorff, but not 
normal since the set {0} and the set of points (2) are disjoint closed sets 
without disjoint neighborhoods. 


THEOREM 9. Every metric space is normal. 


Proof. Let X and Y be any two disjoint closed subsets of a metric 
space R. Every point x ∈ X has a neighborhood O_x disjoint from Y, and 
hence is at a positive distance ρ_x from Y (recall Problem 9, p. 54). 
Similarly, every point y ∈ Y is at a positive distance ρ_y from X. Consider 
the open sets 


    U = ⋃_{x ∈ X} S(x, ρ_x/2),     V = ⋃_{y ∈ Y} S(y, ρ_y/2), 


where, as usual, S(x, r) is the open sphere with center x and radius r. 
It is clear that X ⊂ U, Y ⊂ V. Moreover, U and V are disjoint. In fact, 
suppose to the contrary that there is a point z ∈ U ∩ V. Then there are 
points x_0 ∈ X, y_0 ∈ Y such that 

    ρ(x_0, z) < ρ_{x_0}/2,     ρ(z, y_0) < ρ_{y_0}/2. 

Assume, to be explicit, that ρ_{x_0} ≤ ρ_{y_0}. Then 


    ρ(x_0, y_0) ≤ ρ(x_0, z) + ρ(z, y_0) < ρ_{x_0}/2 + ρ_{y_0}/2 ≤ ρ_{y_0}, 

i.e., x_0 ∈ S(y_0, ρ_{y_0}). This contradicts the definition of ρ_{y_0}, and shows that 
there is no point z ∈ U ∩ V. ∎ 
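

The construction in this proof can be tried numerically on two finite (hence closed) disjoint sets in the plane. The sketch below is only an illustration under stated assumptions: the point sets and function names are made up for the example, and it relies on the fact that two open discs in the Euclidean plane are disjoint exactly when the sum of their radii does not exceed the distance between their centers.

```python
import math
from itertools import product

def dist_to_set(p, S):
    """Distance from the point p to the finite set S."""
    return min(math.dist(p, s) for s in S)

# Two hypothetical disjoint finite sets in the plane.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
Y = [(3.0, 0.0), (1.0, 1.5), (-2.0, 2.0)]

# rho_x = distance from x to Y, rho_y = distance from y to X, as in the proof.
rho_X = {x: dist_to_set(x, Y) for x in X}
rho_Y = {y: dist_to_set(y, X) for y in Y}

def discs_disjoint(c1, r1, c2, r2):
    """Open discs S(c1, r1) and S(c2, r2) are disjoint iff r1 + r2 <= |c1 - c2|."""
    return r1 + r2 <= math.dist(c1, c2)

# Every disc S(x, rho_x/2) misses every disc S(y, rho_y/2), so U and V are disjoint.
all_disjoint = all(
    discs_disjoint(x, rho_X[x] / 2, y, rho_Y[y] / 2) for x, y in product(X, Y)
)
print(all_disjoint)
```

The check succeeds for the same reason the proof does: both ρ_x and ρ_y are at most the distance between x and y, so the two half-radii cannot add up to more than that distance.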


Remark. Every subspace of a metric space is itself a metric space and 
hence normal. This is not true for normal spaces in general, i.e., a subspace 
of a normal space need not be normal.⁴ A property of a topological space 
T shared by every subspace of T is said to be hereditary. Thus normality of a 
space is not a hereditary property. These ideas are pursued in Problems 
13 and 14. 


9.6. Continuous mappings. Homeomorphisms. The concept of a continuous 
mapping, introduced for metric spaces in Sec. 5.2, generalizes at once 
to the case of arbitrary topological spaces. Thus, let f be a mapping of one 
topological space X into another topological space Y, so that f associates 
an element y = f(x) ∈ Y with each element x ∈ X. Then f is said to be 
continuous at the point x_0 ∈ X if, given any neighborhood V_{y_0} of the point 
y_0 = f(x_0), there is a neighborhood U_{x_0} of the point x_0 such that 
f(U_{x_0}) ⊂ V_{y_0}. The mapping f is said to be continuous on X if it is continuous 
at every point of X. In particular, a continuous mapping of a topological space X 
into the real line is called a continuous real function on X. 


Remark. These definitions clearly reduce to the corresponding definitions 
for metric spaces in Sec. 5.2 if X and Y are both metric spaces. 


The notion of continuity of a mapping f of one topological space into 
another⁵ is easily stated in terms of open sets, i.e., in terms of the topologies 
of the two spaces: 


THEOREM 10. A mapping f of a topological space X into a topological 
space Y is continuous if and only if the preimage Γ = f⁻¹(G) of every 
open set G ⊂ Y is open (in X). 


Proof. Suppose f is continuous on X, and let G be any open subset 
of Y. Choose any point x ∈ Γ = f⁻¹(G), and let y = f(x). Then G is a 
neighborhood of the point y. Hence, by the continuity of f, there is a 
neighborhood U_x of x such that f(U_x) ⊂ G, i.e., U_x ⊂ Γ. In other words, 
every point x ∈ Γ has a neighborhood contained in Γ. But then Γ is 
open (see Problem 1). 

Conversely, suppose Γ = f⁻¹(G) is open whenever G ⊂ Y is open. 
Given any point x ∈ X, let V_y be any neighborhood of the point y = f(x). 
Then clearly x ∈ f⁻¹(V_y), and moreover f⁻¹(V_y) is open, by hypothesis. 
Therefore U_x = f⁻¹(V_y) is a neighborhood of x such that f(U_x) ⊂ V_y. 
In other words, f is continuous at x and hence on X, since x is an arbitrary 
point of X. ∎ 


⁴ See, e.g., J. L. Kelley, General Topology, D. Van Nostrand Co., Inc., Princeton, N.J. 
(1955), p. 132. 

⁵ If desired, the mapping f can always be regarded as "onto," since otherwise we need 
only replace the space Y by the subspace f(X) ⊂ Y. 
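

For finite spaces the open-set criterion of Theorem 10 is directly checkable. The following is a small sketch under illustrative assumptions: the spaces, the topologies and the maps `f` and `g` are invented for the example, and mappings are represented as Python dictionaries.

```python
def preimage(f, G, domain):
    """Set of all x in the domain with f(x) in G."""
    return frozenset(x for x in domain if f[x] in G)

def is_continuous(f, X, tau_X, tau_Y):
    """Theorem 10 criterion: the preimage of every open set is open."""
    return all(preimage(f, G, X) in tau_X for G in tau_Y)

X = frozenset({1, 2, 3})
tau_X = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
Y = frozenset({"u", "v"})
tau_Y = {frozenset(), frozenset({"u"}), Y}

f = {1: "u", 2: "v", 3: "v"}   # preimage of {"u"} is {1}, which is open in X
g = {1: "v", 2: "u", 3: "v"}   # preimage of {"u"} is {2}, which is not open in X

print(is_continuous(f, X, tau_X, tau_Y))
print(is_continuous(g, X, tau_X, tau_Y))
```

Only the nontrivial open set {"u"} needs attention here, since the preimages of the empty set and of Y are always the empty set and X.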


Naturally, Theorem 10 has the following “dual’’: 


THEOREM 10′. A mapping f of a topological space X into a topological 
space Y is continuous if and only if the preimage Γ = f⁻¹(F) of every closed 
set F ⊂ Y is closed (in X). 


Proof. Use the fact that the preimage of a complement is the 
complement of the preimage. ∎ 


Remark. Let X and Y be two arbitrary sets, and let f be a mapping of 
X into Y. Suppose that in Y there is specified a topology τ, i.e., a system 
of sets containing Y and ∅, and closed under the operations of taking 
arbitrary unions and finite intersections. Then since the preimage of a 
union (or intersection) of sets equals the union (or intersection) of the 
preimages of the sets, by Theorems 1 and 2, p. 5, the preimage of the 
topology τ, i.e., the system of all sets f⁻¹(G) where G ∈ τ, is a topology 
in X which we denote by f⁻¹(τ). 

Suppose now that X and Y are topological spaces, with topologies τ_X 
and τ_Y, respectively. Then Theorem 10, giving a necessary and sufficient 
condition for a mapping f of X into Y to be continuous, can be paraphrased 
as follows: A mapping f of X into Y is continuous if and only if the topology 
τ_X is stronger than the topology f⁻¹(τ_Y). 
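

The preimage of a topology can be computed explicitly in the finite case. The sketch below is an illustration only (the sets, the map `f` and the helper names are assumptions made for the example): it forms the system of preimages of the open sets, confirms that this system is a topology, and expresses continuity as an inclusion of systems of sets.

```python
from itertools import combinations

def preimage(f, G, domain):
    """Set of all x in the domain with f(x) in G."""
    return frozenset(x for x in domain if f[x] in G)

def preimage_topology(f, domain, tau):
    """System of all preimages of sets G in tau."""
    return {preimage(f, G, domain) for G in tau}

def is_topology(X, tau):
    """Finite case: closure under pairwise unions and intersections suffices."""
    if frozenset() not in tau or frozenset(X) not in tau:
        return False
    return all((A | B) in tau and (A & B) in tau for A, B in combinations(tau, 2))

X = frozenset({1, 2, 3, 4})
Y = frozenset({"u", "v", "w"})
tau_Y = {frozenset(), frozenset({"u"}), frozenset({"u", "v"}), Y}
f = {1: "u", 2: "u", 3: "v", 4: "w"}

pulled = preimage_topology(f, X, tau_Y)

# A hypothetical topology on X that contains every set of the preimage topology.
tau_X = pulled | {frozenset({1})}

print(is_topology(X, pulled))   # the preimage of a topology is a topology
print(pulled <= tau_X)          # tau_X is stronger, so f is continuous
```

Here "stronger" is read as "contains at least as many open sets," which is why the continuity condition becomes a subset test.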


Example. It is easy to see that the image (as opposed to the preimage) of 
an open set under a continuous mapping need not be open. Similarly, the 
image of a closed set under a continuous mapping need not be closed. For 
example, consider the mapping of the half-open interval X = [0, 1) onto the 
circle of unit circumference corresponding to "winding" the interval onto 
the circle. Then the set [1/2, 1), which is closed in [0, 1), goes into a set which 
is not closed on the circle (see Figure 12). 
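

A numerical sketch makes the failure of closedness visible. It assumes one particular parametrization of the winding map (angle 2πt on a circle of radius 1/(2π), so that the circumference is 1); the function name `wind` and the sample points are illustrative choices, not taken from the text.

```python
import math

def wind(t):
    """Wind [0, 1) onto a circle of unit circumference (radius 1/(2*pi))."""
    r = 1 / (2 * math.pi)
    return (r * math.cos(2 * math.pi * t), r * math.sin(2 * math.pi * t))

# Image of t = 0. Since the map is one-to-one on [0, 1) and 0 is not in [1/2, 1),
# this point does not belong to the image of [1/2, 1).
limit = wind(0.0)

# Points of [1/2, 1) approaching 1, and the distances of their images from wind(0).
ts = [1 - 10.0 ** (-k) for k in range(1, 8)]
gaps = [math.dist(wind(t), limit) for t in ts]

for t, gap in zip(ts, gaps):
    print(f"t = {t:.7f}   distance to wind(0) = {gap:.2e}")
```

The distances shrink toward zero, so the image of t = 0 is a contact point of the image of [1/2, 1) without belonging to it.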


FIGURE 12 


The theorem on continuity of composite functions, familiar from 
elementary calculus, has the following analogue for topological spaces: 


THEOREM 11. Given topological spaces X, Y and Z, suppose f is a 
continuous mapping of X into Y and φ a continuous mapping of Y into Z. 
Then the mapping φf, i.e., the mapping carrying x into φ(f(x)), is 
continuous. 


Proof. An immediate consequence of Theorem 10. ∎ 


Given two topological spaces X and Y, let f be a one-to-one mapping of X 
onto Y, and suppose f and f⁻¹ are both continuous. Then f is called a 
homeomorphic mapping or simply a homeomorphism (between X and Y). 
Two spaces X and Y are said to be homeomorphic if there exists a 
homeomorphism between them. Homeomorphic spaces have the same topological 
properties, and from the topological point of view are merely two 
"representatives" of one and the same space. In fact, if X and Y have topologies 
τ_X and τ_Y, respectively, and if f is a homeomorphic mapping of X onto Y, 
then τ_X = f⁻¹(τ_Y) and τ_Y = f(τ_X). The relation of being homeomorphic 
is obviously reflexive, symmetric and transitive, and hence is an equivalence 
relation. Therefore any given family of topological spaces can be partitioned 
into disjoint classes of homeomorphic spaces. 


Remark. Again these are the natural generalizations of the same notions 
for metric spaces, introduced in Sec. 2.2. It should be noted that two homeo- 
morphic metric spaces need not have the same “metric properties’’ (recall 
Problem 9, p. 66). Note also that the topology of a metric space is uniquely 
determined by its metric, but not conversely (illustrate this by an example). 


9.7. Various ways of specifying topologies. Metrizability. The most direct 
and in principle the simplest way of specifying a topology in a space T is to 
indicate which subsets of T are regarded as open. The system of all such 
subsets must then satisfy properties 1) and 2) of Definition 1. By duality, 
we could just as well indicate which subsets of T are regarded as closed. 
The system of all such subsets must then satisfy properties 1’) and 2’) on 
p. 79. However, this method is of limited practical value. For example, in 
the case of the plane it is hardly possible to give a direct description of all 
open sets (as was done in Theorem 6, p. 51 for the case of the line). 

A topology is often specified in a space T by giving a base for T. In 
fact, this is precisely what is done in Sec. 6 for the case of a metric space R, 
where the base for R consists of all open spheres (or even all open spheres 
with rational radii). 

Another way of specifying a topology in a space T is to introduce the 
notion of convergence in T. As noted in Sec. 9.4, this is not a universal 
method. It does work, however, in the case of spaces satisfying the first 
axiom of countability.⁶ 

Still another way of introducing a topology in a space T is to specify 
a closure operator in T, i.e., a mapping which assigns to each subset M ⊂ T 
a subset [M] ⊂ T and satisfies the four properties listed in Theorem 1, 
p. 46. It can be shown that the system of complements of all sets M ⊂ T 
such that [M] = M is then a topology in T.⁷ 

Specifying a metric in a space T is one of the most important ways of 
introducing a topology in T, but it is again far from being a universal method. 
As already noted, every metric space is normal and satisfies the first axiom 
of countability. Hence no metric can be used to introduce a topology in a 
space which fails to have these two properties. A topological space T is said 
to be metrizable if its topology can be specified by means of some metric 
(more exactly, if it is homeomorphic to some metric space). As just pointed 
out, a necessary condition for a topological space T to be metrizable is that 
it be normal and satisfy the first axiom of countability. However, it can be 
shown that these conditions are not sufficient for T to be metrizable. On the 
other hand, in the case of a space with a countable base (i.e., satisfying the 
second axiom of countability), we have 


URYSOHN’S METRIZATION THEOREM. A necessary and sufficient condi- 
tion for a topological space with a countable base to be metrizable is that 
it be normal. 


The necessity follows from Theorem 9. For the sufficiency we refer to the 
literature.⁸ 


Problem 1. Given a topological space T, prove that a set G ⊂ T is open if 
and only if every point x ∈ G has a neighborhood contained in G. 


Problem 2. Given a topological space T, prove that 


a) [M] = M if and only if M is a closed set, i.e., the complement T − G 
of an open set G ⊂ T; 

b) [M] is the smallest closed set containing M; 

c) The closure operator, i.e., the mapping of T into T carrying M into 
[M] satisfies Theorem 1, p. 46. 


Problem 3. Consider the set 𝒯 of all possible topologies defined in a 
set X, where τ_1 ≤ τ_2 means that τ_1 is weaker than τ_2. Verify that ≤ is a 
partial ordering of 𝒯. Does 𝒯 have maximal and minimal elements? If so, 
what are they? 


⁶ In fact, by suitably generalizing the notion of convergence (and introducing the 
concepts of "nets" and "filters"), this method can be made to work quite generally. See, 
e.g., J. L. Kelley, op. cit., p. 83. 
⁷ J. L. Kelley, op. cit., p. 43. 
⁸ See, e.g., P. S. Alexandroff, Einführung in die Mengenlehre und die Theorie der reellen 
Funktionen, VEB Deutscher Verlag der Wissenschaften, Berlin (1956), p. 195 ff. 


Problem 4. Can two distinct topologies τ_1 and τ_2 in X generate the same 
relative topology in a subset A ⊂ X? 


Problem 5. Let 
    X = {a, b, c},   A = {a, b},   B = {b, c}, 
and let Y = {∅, X, A, B}. Is Y a base for a topology in X? 


Problem 6. Prove that if M is an uncountable subset of a topological 
space with a countable base, then some point of M is a limit point of M. 


Problem 7. Prove that the topological space T in Example 4, p. 79 is 
connected. 


Comment. T might be called a ‘“‘connected doubleton.” 


Problem 8. Prove that a topological space satisfying the second axiom of 
countability automatically satisfies the first axiom of countability. 


Problem 9. Give an example of a topological space satisfying the first 
axiom of countability but not the second axiom of countability. 


Problem 10. Let τ be the system of sets consisting of the empty set and 
every subset of the closed unit interval X = [0, 1] obtained by deleting a finite 
or countable number of points from X. Verify that T = (X, τ) is a topological 
space. Prove that T satisfies neither the second nor the first axiom of 
countability. Prove that T is a T_1-space, but not a Hausdorff space. 


Problem 11. Let T be the topological space of the preceding problem. 
Prove that the only convergent sequences in T are the “stationary sequences,” 
i.e., the sequences all of whose terms are the same starting from some index 
n. Prove that the set M = (0, 1] has the point 0 as a contact point, but 
contains no sequence of points converging to 0. 


Problem 12. Prove the converse of Theorem 8. 


Comment. Hence a topological space T is a T,-space if and only if every 
finite subset of T is closed. 


Problem 13. Prove the following theorem, known as Urysohn's lemma: 
Given a normal space T and two disjoint closed subsets F_1, F_2 ⊂ T, there 
exists a continuous real function f such that 0 ≤ f(x) ≤ 1 and 


    f(x) = 0 if x ∈ F_1, 
    f(x) = 1 if x ∈ F_2. 


Problem 14. A T_1-space T is said to be completely regular if, given any 
closed set F ⊂ T and any point x_0 ∈ T − F, there exists a continuous real 
function f such that 0 ≤ f(x) ≤ 1 and 


    f(x) = 0 if x = x_0, 
    f(x) = 1 if x ∈ F. 


(Completely regular spaces are also called Tychonoff spaces.) Prove that 
every normal space is completely regular, but not conversely. Prove that 
every subspace of a completely regular space (in particular, of a normal space) 
is completely regular. 


Comment. Thus, unlike normality, complete regularity is a hereditary 
property. It can be shown that a space is completely regular if and only if 
it is a subspace of a normal space.⁹ Completely regular spaces are particularly 
important in analysis, since they "are able to support sufficiently many 
continuous functions," i.e., for any two distinct points x and y of a completely 
regular space T, there is a continuous real function on T taking distinct 
values at x and y. 


10. Compactness 


10.1. Compact topological spaces. The reader has presumably already 
encountered the familiar 


HEINE-BOREL THEOREM. Any cover of a closed interval [a, b] by a system 
of open intervals (or, more generally, open sets) has a finite subcover. 


Generalizing this property of closed intervals, we are led to a key concept 
of real analysis: 


DEFINITION 1. A topological space T is said to be compact if every open 
cover of T has a finite subcover. A compact Hausdorff space is called a 
compactum. 


Example. As we will see in Sec. 11.2, any closed bounded subset of 
Euclidean n-space Rⁿ is compact, for arbitrary n. On the other hand, Rⁿ 
itself (e.g., the real line or three-dimensional space) is not compact. 


DEFINITION 2. A system of subsets {A_α} of a set T is said to be centered 
if every finite intersection ⋂_{k=1}^{n} A_k is nonempty.¹⁰ 


⁹ J. L. Kelley, op. cit., p. 145. 
¹⁰ A system of sets with typical member A_α will often be denoted by {A_α} (this is still 
another use of curly brackets). 


THEOREM 1. A topological space T is compact if and only if it has the 
following property: 

A) Every centered system of closed subsets of T has a nonempty 
intersection. 


Proof. Suppose T is compact, and let {F_α} be any centered system of 
closed subsets of T. Then the sets G_α = T − F_α are open. Hence the fact 
that no finite intersection ⋂_{k=1}^{n} F_k is empty implies that no finite system of 
sets G_k = T − F_k covers T. But then the whole system of sets {G_α} cannot 
cover T, by the compactness, and hence ⋂_α F_α ≠ ∅. In other words, 
T has property A) if T is compact. 

Conversely, suppose T has property A), and let {G_α} be any open 
cover of T. Setting F_α = T − G_α, we find that ⋂_α F_α = ∅, which, by 
property A), implies that the system {F_α} is not centered, i.e., that there 
are sets F_1, ..., F_n such that ⋂_{k=1}^{n} F_k = ∅. But then the corresponding 
open sets G_k = T − F_k form a finite subcover of the cover {G_α}. In 
other words, T is compact if T has property A). ∎ 


THEOREM 2. Every closed subset F of a compact topological space T is 
itself compact. 


Proof. Let {F_α} be any centered system of closed subsets of the 
subspace F ⊂ T. Then every F_α is closed in T as well, i.e., {F_α} is a centered 
system of closed subsets of T. Therefore ⋂_α F_α ≠ ∅, by Theorem 1. 
But then F is compact, by Theorem 1 again. ∎ 


COROLLARY. Every closed subset of a compactum is itself a compactum. 


Proof. Use Theorem 2 and the fact that every subset of a Hausdorff 
space is itself a Hausdorff space. 


THEOREM 3. Let K be a compactum and T any Hausdorff space con- 
taining K. Then K is closed in T. 


Proof. Suppose y ∉ K, so that y ∈ T − K. Then, given any point 
x ∈ K, there is a neighborhood U_x of x and a neighborhood V_x of y such 
that 


    U_x ∩ V_x = ∅. 

The neighborhoods {U_x} (x ∈ K) form an open cover of K. Hence, by the 
compactness of K, {U_x} has a finite subcover consisting of sets U_{x_1}, ..., 
U_{x_n}. Let 


    V = V_{x_1} ∩ ··· ∩ V_{x_n}. 

Then V is a neighborhood of the point y which does not intersect the set 
U_{x_1} ∪ ··· ∪ U_{x_n} ⊃ K, and hence y ∉ [K]. It follows that K is closed. ∎ 


Remark. It is a consequence of Theorems 2 and 3 that compactness is 
an “intrinsic property,’’ in the sense that a compactum remains a compactum 
after being “embedded”’ in any larger Hausdorff space. 


THEOREM 4. Every compactum K is a normal space. 


Proof. Let X and Y be any two disjoint closed subsets of K. 
Repeating the argument given in the proof of Theorem 3, we easily see that, 
given any point y ∈ Y, there exists a neighborhood U_y containing y and 
an open set O_y ⊃ X such that U_y ∩ O_y = ∅. Since Y is compact, by 
Theorem 2, the cover {U_y} (y ∈ Y) of the set Y has a finite subcover 
U_{y_1}, ..., U_{y_n}. The open sets 


    O⁽¹⁾ = O_{y_1} ∩ ··· ∩ O_{y_n},     O⁽²⁾ = U_{y_1} ∪ ··· ∪ U_{y_n} 


then satisfy the normality conditions 

    O⁽¹⁾ ⊃ X,     O⁽²⁾ ⊃ Y,     O⁽¹⁾ ∩ O⁽²⁾ = ∅. ∎ 


10.2. Continuous mappings of compact spaces. Next we show that the 
“continuous image’’ of a compact space is itself a compact space: 


THEOREM 5. Let X be a compact space and f a continuous mapping of X 
onto a topological space Y. Then Y = f(X) is itself compact. 


Proof. Let {V_α} be any open cover of Y, and let U_α = f⁻¹(V_α). Then 
the sets U_α are open (being preimages of open sets under a continuous 
mapping) and cover the space X. Since X is compact, {U_α} has a finite 
subcover U_{α_1}, ..., U_{α_n}. Then the sets V_{α_1}, ..., V_{α_n}, where V_{α_k} = f(U_{α_k}), 
cover Y. It follows that Y is compact. ∎ 


THEOREM 6. A one-to-one continuous mapping of a compactum X 
onto a compactum Y is necessarily a homeomorphism. 


Proof. We must show that the inverse mapping f⁻¹ is itself continuous. 
Let F be a closed set in X and P = f(F) its image in Y. Then F is compact, 
by Theorem 2, and hence P is a compactum, by Theorem 5. Hence, by 
Theorem 3, P is closed in Y. Therefore the preimage under f⁻¹ of any 
closed set F ⊂ X is closed. It follows from Theorem 10′, p. 88 that f⁻¹ is 
continuous. ∎ 


10.3. Countable compactness. We begin by proving an important property 
of compact spaces: 


THEOREM 7. If T is a compact space, then any infinite subset of T has 
at least one limit point. 


Proof. Suppose T contains an infinite set X with no limit point. Then 
T contains a countable set 


    X_1 = {x_1, x_2, ..., x_n, ...} 

with no limit point. But then the sets 


    X_n = {x_n, x_{n+1}, ...}     (n = 1, 2, ...) 


form a centered system of closed sets in T with an empty intersection, 
i.e., T is not compact. ∎ 


These considerations suggest 


DEFINITION 3. A topological space T is said to be countably compact 
if every infinite subset of T has at least one limit point (in T). 


Thus Theorem 7 says that every compact set is countably compact. The 
converse, however, is not true (see Problem 1). The relation between the 
concepts of compactness and countable compactness is made clear by 


THEOREM 8. Each of the following two conditions is necessary and 
sufficient for a topological space T to be countably compact: 


1) Every countable open cover of T has a finite subcover; 
2) Every countable centered system of closed subsets of T has a 
nonempty intersection. 


Proof. The equivalence of conditions 1) and 2) is an immediate 
consequence of the duality principle. Moreover, if T is not countably 
compact, then, repeating the argument given in proving Theorem 7, 
we find that there is a countable centered system of closed subsets of T 
with an empty intersection. This proves the sufficiency of condition 2). 

Thus we need only prove the necessity of condition 2). Let T be 
countably compact, and let {F_n} be a countable centered system of 
closed sets in T. Then, as we now show, ⋂_n F_n ≠ ∅. Let 


    Φ_n = ⋂_{k=1}^{n} F_k. 

Then none of the Φ_n is empty, since {F_n} is centered. Moreover, 


    Φ_1 ⊃ Φ_2 ⊃ ··· ⊃ Φ_n ⊃ ···, 


and 

    ⋂_n Φ_n = ⋂_n F_n. 


There are now just two possibilities: 


1) Φ_{n_0} = Φ_{n_0+1} = ··· starting from some index n_0, in which case it 
is obvious that ⋂_n Φ_n = Φ_{n_0} ≠ ∅. 

2) There are infinitely many distinct sets Φ_n. In this case, there is 
clearly no loss of generality in assuming that all the Φ_n are distinct. 
Let x_n ∈ Φ_n − Φ_{n+1}. Then the sequence {x_n} consists of infinitely 
many distinct points of T, and hence, by the countable compactness 
of T, must have at least one limit point, say x_0. But then x_0 
must be a limit point of Φ_n, since Φ_n contains all the points x_n, 
x_{n+1}, .... Moreover x_0 ∈ Φ_n, since Φ_n is closed. It follows that 


    x_0 ∈ ⋂_n Φ_n,   i.e.,   ⋂_n Φ_n ≠ ∅. ∎ 


Thus compact topological spaces are those in which an arbitrary open 
cover has a finite subcover, while countably compact spaces are those in 
which every countable open cover has a finite subcover. Although in general 
countable compactness does not imply compactness, we have the following 
important special situation: 


THEOREM 9. The concepts of compactness and countable compactness 
coincide for a topological space T with a countable base. 


Proof. By Theorem 6, p. 83, every open cover 𝒪 of T has a countable 
subcover. Hence, if T is countably compact, 𝒪 has a finite subcover, by 
Theorem 8. ∎ 


Remark. The concept of a countably compact topological space, unlike 
that of a compact space, has not turned out to be very natural or fruitful. 
Its presence in mathematics can be explained in terms of a kind of "historical 
inertia." The point is that, as will be shown in the next section, the concepts 
of compactness and countable compactness coincide for metric spaces, as 
well as for spaces with a countable base. The notion of compactness was 
originally introduced in connection with metric spaces, with a compact metric 
space being defined as one in which every infinite subset has at least one 
limit point (i.e., in terms of what is now called “countable compactness”). 
The “automatic transcription” of this definition from metric spaces to 
topological spaces then led to the concept of a countably compact topological 
space. Sometimes, especially in the older literature, the word "compact" 
is used in the sense of "countably compact," and a topological space compact 
in our sense (i.e., such that every open cover has a finite subcover) is said 
to be "bicompact." In this older language, a compact Hausdorff space 
(a "compactum" in our terminology) is called a "bicompactum," and the 
term "compactum" is reserved for a compact metric space. We will adhere 
to the terminology introduced in Definitions 1 and 3, often using the term 
"metric compactum" to designate a compact metric space. 


10.4. Relatively compact subsets. Among the subsets of a topological 
space, those whose closures are compact are of special interest: 


DEFINITION 4. A subset M of a topological space T is said to be 
relatively compact (in T) if its closure [M] in T is compact. 


Example 1. According to Theorem 2, every subset of a compact topo- 
logical space is relatively compact. 


Example 2. As we will see in Sec. 11.3, every bounded subset of the real 
line R¹ (or more generally of Euclidean n-space Rⁿ) is relatively compact. 


A related concept is given by 


DEFINITION 5. A subset M of a topological space T is said to be 
relatively countably compact (in T) if every infinite subset A ⊂ M has at least 
one limit point in T (which may or may not belong to M). 


Relative compactness (unlike compactness) is not an “intrinsic property,” 
i.e., it depends on the space T in which the given set M is “embedded.” 
For example, the set of all rational numbers in the interval (0, 1) is relatively 
compact if regarded as a subset of the real line, but not if regarded as a subset 
of the space of all rational numbers. The concept of relative compactness 
is most important in the case of metric spaces (see Sec. 11.3). 


Problem 1. Let X be the set of all ordinal numbers less than the first 
uncountable ordinal. Let (α, β) ⊂ X denote the set of all ordinal numbers 
γ such that α < γ < β, and let the open sets in X be all unions of intervals 
(α, β). Prove that the resulting topological space is countably compact but 
not compact. 


Problem 2. A topological space T is said to be locally compact if every 
point x € T has at least one relatively compact neighborhood. Show that a 
compact space is automatically locally compact, but not conversely. Prove 
that every closed subspace of a locally compact space is locally compact. 


Problem 3. A point x is said to be a complete limit point of a subset A of a 
topological space if, given any neighborhood U of x, the sets A and A ∩ U 
have the same power (i.e., cardinal number). Prove that every infinite subset 
of a compact topological space has at least one complete limit point. 


Comment. Conversely, it can be shown that if every infinite subset of a 
topological space T has at least one complete limit point, then T is compact.¹¹ 


11. Compactness in Metric Spaces 


11.1. Total boundedness. Since metric spaces are topological spaces of a 
special kind, the definitions and results of the preceding section apply to 
metric spaces as well. However, in the case of metric spaces, the concept 
of compactness is intimately connected with another concept, known as 
total boundedness. 


¹¹ P. S. Alexandroff, op. cit., pp. 250-251; J. L. Kelley, op. cit., pp. 163-164. 


DEFINITION 1. Let R be a metric space and ε any positive number. Then 
a set A ⊂ R is said to be an ε-net for a set M ⊂ R if, for every x ∈ M, 
there is at least one point a ∈ A such that ρ(x, a) ≤ ε. 


Example 1. The set of all points with integral coordinates is a (1/√2)-net 
in the plane. 
Example 2. Every subset of a totally bounded set is itself totally bounded. 


DEFINITION 2. Given a metric space R and a subset M ⊂ R, suppose M 
has a finite ε-net for every ε > 0. Then M is said to be totally bounded. 


If a set M is totally bounded, then obviously so is its closure [M]. Every 
totally bounded set is automatically bounded, being the union of a finite 
number of bounded sets (recall Problem 5, p. 65). The converse is not true, 
as shown in Example 4. 


Example 3. In Euclidean n-space Rⁿ, total boundedness is equivalent to 
boundedness. In fact, if M ⊂ Rⁿ is bounded, then M is contained in some 
sufficiently large cube Q. Partitioning Q into smaller cubes of side ε, we find 
that the vertices of the little cubes form a finite (√n ε/2)-net for Q and hence 
(a fortiori) for any set contained in Q. 
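

The grid construction just described is easy to test numerically. The following sketch is illustrative only: the unit cube, the side 0.25, the sample size and the function names are assumptions chosen for the example. It builds the vertices of the small cubes and checks that random points of the cube lie within √n ε/2 of some vertex.

```python
import itertools
import math
import random

def grid_net(n, eps, side=1.0):
    """Vertices of the partition of the cube [0, side]^n into cubes of side eps."""
    steps = math.ceil(side / eps)
    ticks = [k * eps for k in range(steps + 1)]
    return list(itertools.product(ticks, repeat=n))

def distance_to_net(p, net):
    """Distance from p to the nearest point of the net."""
    return min(math.dist(p, a) for a in net)

n, eps = 2, 0.25
net = grid_net(n, eps)
bound = math.sqrt(n) * eps / 2      # the radius (sqrt(n) * eps / 2) claimed in the text

random.seed(0)
samples = [tuple(random.random() for _ in range(n)) for _ in range(1000)]
worst = max(distance_to_net(p, net) for p in samples)

print(len(net), worst <= bound)
```

The bound is attained at the center of each small cube, which is why the net radius cannot be taken smaller for this particular grid.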


Example 4. The unit sphere Σ in l_2, with equation 

    ∑_{n=1}^{∞} x_n² = 1, 

is bounded but not totally bounded. In fact, consider the points 

    e_1 = (1, 0, 0, ...),   e_2 = (0, 1, 0, ...),   ..., 

where the nth coordinate of e_n is one and the others are all zero. These 
points all lie on Σ, and the distance between any two of them is √2. Hence 
Σ cannot have a finite ε-net with ε < √2/2. 
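

The distance computation behind this example can be checked on a finite truncation. In the sketch below the dimension 6 is an arbitrary stand-in for the infinite-dimensional space, and the helper name is an assumption; the point is only that any two distinct coordinate unit vectors are at distance √2 from each other, so a sphere of radius less than √2/2 about any point can contain at most one of them.

```python
import math
from itertools import combinations

def basis_vector(i, dim):
    """Vector with a one in position i and zeros elsewhere."""
    return tuple(1.0 if k == i else 0.0 for k in range(dim))

dim = 6   # hypothetical truncation of the infinite sequence e_1, e_2, ...
es = [basis_vector(i, dim) for i in range(dim)]

pairwise = [math.dist(p, q) for p, q in combinations(es, 2)]
print(all(math.isclose(d, math.sqrt(2)) for d in pairwise))
```

Since infinitely many such vectors exist in the full space, no finite set of centers with radius below √2/2 can reach all of them.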
Example 5. Let Π be the set of points x = (x_1, x_2, ..., x_n, ...) in l_2 
satisfying the inequalities 


    |x_1| ≤ 1,   |x_2| ≤ 1/2,   ...,   |x_n| ≤ 1/2^{n−1},   .... 


The set Π, called the Hilbert cube (or fundamental parallelepiped),¹² furnishes 
an example of an infinite-dimensional totally bounded set. The fact that Π 
is totally bounded can be seen as follows: Given any ε > 0, choose n such 
that 

    1/2^{n−1} < ε/2, 

and with each point 

    x = (x_1, x_2, ..., x_n, ...) 

in Π associate the point 


    x* = (x_1, x_2, ..., x_n, 0, 0, ...)                          (1) 


(x* is also a point in Π). Then 


    ρ(x, x*) = √(∑_{k=n+1}^{∞} x_k²) ≤ √(∑_{k=n}^{∞} 1/4^k) < 1/2^{n−1} < ε/2. 


But the set Π* of all points in Π of the form (1) is totally bounded, being 
a bounded set in n-space. Let A be a finite (ε/2)-net in Π*. Then A is a finite 
ε-net for the whole set Π. 


¹² Another commonly encountered definition of the Hilbert cube is the set of points 
in l_2 satisfying the inequalities 

    |x_1| ≤ 1,   |x_2| ≤ 1/2,   ...,   |x_n| ≤ 1/n,   .... 
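

The tail estimate used for the Hilbert cube can be checked numerically. This is a sketch under stated assumptions: the value ε = 0.01, the truncation of the infinite sum after 200 terms, and the function names are all illustrative. It picks the smallest n with 1/2^{n−1} < ε/2 and evaluates the largest possible distance between a point of the cube and its truncation, which occurs when every coordinate equals its bound 1/2^{k−1}.

```python
import math

def choose_n(eps):
    """Smallest n >= 1 with 1 / 2**(n - 1) < eps / 2."""
    n = 1
    while 1 / 2 ** (n - 1) >= eps / 2:
        n += 1
    return n

def worst_tail(n, terms=200):
    """sqrt of the sum over k > n of (1 / 2**(k - 1))**2, truncated after `terms` terms."""
    return math.sqrt(sum(1 / 4 ** (k - 1) for k in range(n + 1, n + 1 + terms)))

eps = 0.01
n = choose_n(eps)
tail = worst_tail(n)

print(n, tail, 1 / 2 ** (n - 1), eps / 2)
```

The geometric series sums in closed form to 1/(√3 · 2^{n−1}), which is comfortably below 1/2^{n−1}.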


11.2. Compactness and total boundedness. We now show the connection 
between the concepts of compactness (of both kinds) and total boundedness: 


THEOREM 1. Every countably compact metric space R is totally bounded. 


Proof. Suppose $R$ is not totally bounded. Then there is an
$\varepsilon_0 > 0$ such that $R$ has no finite $\varepsilon_0$-net. Choose
any point $a_1 \in R$. Then $R$ contains at least one point, say $a_2$, such
that
$$\rho(a_1, a_2) > \varepsilon_0,$$
since otherwise $a_1$ would be an $\varepsilon_0$-net for $R$. Moreover, $R$
contains a point $a_3$ such that
$$\rho(a_1, a_3) > \varepsilon_0, \quad \rho(a_2, a_3) > \varepsilon_0,$$
since otherwise the pair $a_1, a_2$ would be an $\varepsilon_0$-net for $R$.
More generally, once having found the points $a_1, a_2, \ldots, a_n$, we
choose $a_{n+1} \in R$ such that
$$\rho(a_k, a_{n+1}) > \varepsilon_0 \quad (k = 1, 2, \ldots, n).$$
This construction gives an infinite sequence of distinct points
$a_1, a_2, \ldots, a_n, \ldots$ with no limit points, since
$\rho(a_j, a_k) > \varepsilon_0$ if $j \ne k$. But then $R$ cannot be
countably compact. ∎
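
The point-by-point selection used in this proof can be illustrated on a finite sample. The sketch below is an illustration only, not part of the proof: it assumes points in the plane with the Euclidean metric, and the function name `greedy_separated_points` is hypothetical. Each accepted point lies at distance greater than `eps` from all previously accepted ones, so every rejected point lies within `eps` of some accepted point.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def greedy_separated_points(points, eps):
    """Accept a point only if it is farther than eps from all accepted points."""
    chosen = []
    for p in points:
        if all(dist(p, c) > eps for c in chosen):
            chosen.append(p)
    return chosen

random.seed(0)
points = [(random.random(), random.random()) for _ in range(500)]
eps = 0.2
centers = greedy_separated_points(points, eps)
```

On a bounded planar sample the selection stops after finitely many points, and the accepted points serve as centers covering the whole sample within `eps`. In the proof, the assumption that no finite $\varepsilon_0$-net exists is exactly what keeps the selection going forever.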


COROLLARY 1. Every countably compact metric space has a countable
everywhere dense subset and a countable base.


Proof. Since $R$ is totally bounded, by Theorem 1, $R$ has a finite
$(1/n)$-net for every $n = 1, 2, \ldots$. The union of all these nets is then
a countable everywhere dense subset of $R$. It follows from Theorem 5, p. 82
that $R$ has a countable base. ∎


COROLLARY 2. Every countably compact metric space is compact.


Proof. An immediate consequence of Corollary 1 and Theorem 9, 
p. 96. 


According to Theorem 1, total boundedness is a necessary condition for 
a metric space to be compact. However, this condition is not sufficient. For 
example, the set of rational points in the interval [0, 1] with the ordinary 
definition of distance forms a metric space R which is totally bounded but 
not compact. In fact, the sequence of points 


0, 0.4, 0.41, 0.414, 0.4142, ... 


in $R$, i.e., the sequence of decimal approximations to the irrational number
$\sqrt{2} - 1$, has no limit point in $R$. Necessary and sufficient conditions
for compactness of a metric space are given by


THEOREM 2. A metric space R is compact if and only if it is totally 
bounded and complete. 


Proof. To see that compactness of $R$ implies completeness of $R$,
we need only note that if $R$ has a Cauchy sequence $\{x_n\}$ with no limit,
then $\{x_n\}$ has no limit points in $R$. This, together with Theorem 1,
shows that $R$ is totally bounded and complete if $R$ is compact.

Conversely, suppose $R$ is totally bounded and complete, and let $\{x_n\}$
be any infinite sequence of distinct points in $R$. Let $N_1$ be a finite
1-net for $R$, and construct a closed sphere of radius 1 about every point of
$N_1$. Since these spheres cover $R$ and there are only finitely many of them,
at least one of the spheres, say $S_1$, contains an infinite subsequence
$$x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)}, \ldots$$
of the sequence $\{x_n\}$. Let $N_2$ be a finite $\frac{1}{2}$-net for $R$,
and construct a closed sphere of radius $\frac{1}{2}$ about every point of
$N_2$. Then at least one of these spheres, say $S_2$, contains an infinite
subsequence
$$x_1^{(2)}, x_2^{(2)}, \ldots, x_n^{(2)}, \ldots$$
of the sequence $\{x_n^{(1)}\}$. Continue this construction indefinitely,
finding a closed sphere $S_3$ of radius $\frac{1}{4}$ containing an infinite
subsequence
$$x_1^{(3)}, x_2^{(3)}, \ldots, x_n^{(3)}, \ldots$$
of the sequence $\{x_n^{(2)}\}$, and so on, where $S_n$ has radius
$1/2^{n-1}$. Let $S_n'$ be the closed sphere with the same center as $S_n$
but with a radius $r_n$ twice as large (i.e., equal to $1/2^{n-2}$). Then
clearly
$$S_1' \supset S_2' \supset \cdots \supset S_n' \supset \cdots,$$
and moreover $r_n \to 0$ as $n \to \infty$. Since $R$ is complete, it follows
from the nested sphere theorem (Theorem 2, p. 60) that
$$\bigcap_{n=1}^{\infty} S_n' \ne \emptyset.$$
In fact, there is a point $x_0 \in R$ such that
$$\bigcap_{n=1}^{\infty} S_n' = \{x_0\}$$
(recall Problem 3, p. 65). Clearly $x_0$ is a limit point of the original
sequence $\{x_n\}$, since every neighborhood of $x_0$ contains some sphere
$S_n$ and hence some infinite subsequence $\{x_k^{(n)}\}$. Therefore every
infinite sequence $\{x_n\}$ of distinct points of $R$ has a limit point in
$R$. It follows that $R$ is countably compact and hence compact, by
Corollary 2. ∎


Example. As already noted, a subset $M$ of Euclidean $n$-space $R^n$ is
totally bounded if and only if it is bounded. Moreover, $M$ is complete if and
only if it is closed (recall Problem 7, p. 66). Hence, by Theorem 2, the set
of all compact subsets of $R^n$ coincides with the set of all closed bounded
subsets of $R^n$.


11.3. Relatively compact subsets of a metric space. The concept of relative 
compactness, introduced in Sec. 10.4 for subsets of an arbitrary topological 
space, applies in particular to subsets of a metric space. In the case of a 
metric space, however, there is no longer any distinction between relative 
compactness and relative countable compactness. 


THEOREM 3. A subset M of a complete metric space R is relatively 
compact if and only if it is totally bounded. 


Proof. An immediate consequence of Theorem 2 and the fact that a
closed subset of a complete metric space is itself complete. ∎


Example. Any bounded subset of Euclidean n-space is totally bounded
and hence relatively compact (this is our version of the familiar
Bolzano-Weierstrass theorem).


Remark. The utility of Theorem 3 stems from the fact that it is usually easier
to prove that a set is totally bounded than to give a direct proof of its relative 
compactness. On the other hand, compactness is the key property as far as 
applications are concerned. 


11.4. Arzelà's theorem. The problem of proving the compactness of
various subsets of a given metric space is encountered quite frequently in
analysis. However, the direct application of Theorem 2 is not always easy.
This explains the need for special criteria serving as practical tools for
proving compactness in particular spaces. For example, as we have seen, the
boundedness of a set in Euclidean n-space implies its relative compactness,
but this implication fails in more general metric spaces.

One of the most important metric spaces in analysis is the function space
$C_{[a,b]}$, introduced in Example 6, p. 39. For subsets of this space, we
have an important and frequently used criterion for relative compactness,
called Arzelà's theorem, which will be stated and proved after first
introducing two new concepts:


DEFINITION 3. A family $\Phi$ of functions $\varphi$ defined on a closed
interval $[a, b]$ is said to be uniformly bounded if there exists a number
$K > 0$ such that
$$|\varphi(x)| < K$$
for all $x \in [a, b]$ and all $\varphi \in \Phi$.


DEFINITION 4. A family $\Phi$ of functions $\varphi$ defined on a closed
interval $[a, b]$ is said to be equicontinuous if, given any
$\varepsilon > 0$, there exists a number $\delta > 0$ such that
$|x' - x''| < \delta$ implies
$$|\varphi(x') - \varphi(x'')| < \varepsilon$$
for all $x', x'' \in [a, b]$ and all $\varphi \in \Phi$.
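
A rough numerical comparison may help to separate the two definitions. The sketch below is a grid-based estimate only, not a proof, and the names `max_oscillation`, `shifted` and `scaled` are hypothetical. It estimates the largest value of $|\varphi(x') - \varphi(x'')|$ over grid points with $|x' - x''| < \delta$, for two uniformly bounded families on $[0, 1]$: the shifts $\sin(x + c)$ and the rescalings $\sin(kx)$.

```python
import math

def max_oscillation(family, a, b, delta, n=400):
    """Largest |phi(x') - phi(x'')| over grid points with |x' - x''| < delta."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    worst = 0.0
    for phi in family:
        values = [phi(x) for x in xs]
        for i in range(n + 1):
            j = i + 1
            while j <= n and xs[j] - xs[i] < delta:
                worst = max(worst, abs(values[j] - values[i]))
                j += 1
    return worst

# Both families are uniformly bounded by K = 1.
shifted = [lambda x, c=c: math.sin(x + c) for c in range(10)]
scaled = [lambda x, k=k: math.sin(k * x) for k in (1, 10, 40)]

osc_shifted = max_oscillation(shifted, 0.0, 1.0, 0.05)
osc_scaled = max_oscillation(scaled, 0.0, 1.0, 0.05)
```

For the shifted family the estimate never exceeds $\delta$, since every member has slope at most 1 in absolute value. For the rescaled family it is already larger than 1 with only three members, and no single $\delta$ works for all $\sin(kx)$, $k = 1, 2, \ldots$, so that family is uniformly bounded without being equicontinuous.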


THEOREM 4 (Arzelà). A necessary and sufficient condition for a family
$\Phi$ of continuous functions $\varphi$ defined on a closed interval $[a, b]$
to be relatively compact in $C_{[a,b]}$ is that $\Phi$ be uniformly bounded
and equicontinuous.


Proof. We give the proof in two steps: 


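Step 1 (Necessity). Suppose $\Phi$ is relatively compact in $C_{[a,b]}$.
Then by Theorem 3, given any $\varepsilon > 0$, there is a finite
$(\varepsilon/3)$-net $\varphi_1, \ldots, \varphi_n$ in $\Phi$ (see Problem 1).
Being a continuous function defined on a closed interval, each $\varphi_i$ is
bounded:
$$|\varphi_i(x)| \le K_i \quad (a \le x \le b).$$
Let
$$K = \max \{K_1, \ldots, K_n\} + \frac{\varepsilon}{3}.$$
By the definition of an $(\varepsilon/3)$-net, given any $\varphi \in \Phi$,
there is at least one $\varphi_i$ such that
$$\rho(\varphi, \varphi_i) = \max_{a \le x \le b} |\varphi(x) - \varphi_i(x)| < \frac{\varepsilon}{3}.$$
Therefore
$$|\varphi(x)| < |\varphi_i(x)| + \frac{\varepsilon}{3} \le K_i + \frac{\varepsilon}{3} \le K,$$
i.e., $\Phi$ is uniformly bounded. Moreover, each function $\varphi_i$ in the
$(\varepsilon/3)$-net is continuous, and hence uniformly continuous, on
$[a, b]$. Hence, given any $\varepsilon > 0$, there is a $\delta_i$ such that
$$|\varphi_i(x_1) - \varphi_i(x_2)| < \frac{\varepsilon}{3}$$
whenever $|x_1 - x_2| < \delta_i$. Let
$$\delta = \min \{\delta_1, \ldots, \delta_n\}.$$
Then, given any $\varphi \in \Phi$ and choosing $\varphi_i$ such that
$\rho(\varphi, \varphi_i) < \varepsilon/3$, we have
$$|\varphi(x_1) - \varphi(x_2)| \le |\varphi(x_1) - \varphi_i(x_1)| + |\varphi_i(x_1) - \varphi_i(x_2)| + |\varphi_i(x_2) - \varphi(x_2)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$
whenever $|x_1 - x_2| < \delta$. This proves the equicontinuity of $\Phi$.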


Step 2 (Sufficiency). Suppose $\Phi$ is uniformly bounded and
equicontinuous. According to Theorem 3, to prove that $\Phi$ is relatively
compact in $C_{[a,b]}$, we need only show that $\Phi$ is totally bounded,
i.e., that given any $\varepsilon > 0$, there exists a finite
$\varepsilon$-net for $\Phi$ in $C_{[a,b]}$. Suppose $|\varphi(x)| < K$ for
all $\varphi \in \Phi$, and let $\delta > 0$ be such that
$$|\varphi(x_1) - \varphi(x_2)| < \frac{\varepsilon}{5}$$
for all $\varphi \in \Phi$ whenever $|x_1 - x_2| < \delta$. Divide the
interval $a \le x \le b$ along the x-axis into subintervals of length less
than $\delta$, by introducing points of subdivision
$x_0, x_1, x_2, \ldots, x_n$ such that
$$a = x_0 < x_1 < x_2 < \cdots < x_n = b,$$
and then draw a vertical line through each of these points. Similarly,
divide the interval $-K \le y \le K$ along the y-axis into subintervals of
length less than $\varepsilon/5$, by introducing points of subdivision
$y_0, y_1, y_2, \ldots, y_p$ such that
$$-K = y_0 < y_1 < y_2 < \cdots < y_p = K,$$
and then draw a horizontal line through each of these points. In this
way, the rectangle $a \le x \le b$, $-K \le y \le K$ is divided into $np$
cells of horizontal side length less than $\delta$ and vertical side length
less than $\varepsilon/5$. We now associate with each function
$\varphi \in \Phi$ a polygonal line $y = \psi(x)$ which has vertices at points
of the form $(x_k, y_l)$ and differs from the function $\varphi$ by less than
$\varepsilon/5$ at every point $x_k$ (the reader should draw a figure and
convince himself of the existence of such a function). Since
$$|\varphi(x_k) - \psi(x_k)| < \frac{\varepsilon}{5}, \quad |\varphi(x_{k+1}) - \psi(x_{k+1})| < \frac{\varepsilon}{5}, \quad |\varphi(x_k) - \varphi(x_{k+1})| < \frac{\varepsilon}{5}$$
by construction, we have
$$|\psi(x_k) - \psi(x_{k+1})| < \frac{3\varepsilon}{5}.$$
Moreover,
$$|\psi(x_k) - \psi(x)| < \frac{3\varepsilon}{5} \quad (x_k \le x \le x_{k+1}),$$
since $\psi(x)$ is linear between the points $x_k$ and $x_{k+1}$. Let $x$ be
any point in $[a, b]$ and $x_k$ the point of subdivision nearest to $x$ on
the left. Then
$$|\varphi(x) - \psi(x)| \le |\varphi(x) - \varphi(x_k)| + |\varphi(x_k) - \psi(x_k)| + |\psi(x_k) - \psi(x)| < \varepsilon,$$
i.e., the set of polygonal lines $\psi(x)$ forms an $\varepsilon$-net for
$\Phi$. But there are obviously only finitely many such lines. Therefore
$\Phi$ is totally bounded. ∎
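
The grid construction of Step 2 can be tried out on a single function. The sketch below makes one simplifying assumption that is not in the proof: the function is taken to be Lipschitz with a known constant `lip`, so that $\delta = \varepsilon/(5\,\mathrm{lip})$ is an admissible choice. The name `polygonal_approximation` and its parameters are hypothetical.

```python
import math

def polygonal_approximation(phi, a, b, K, lip, eps):
    """Snap phi to a grid with x-steps < delta and y-steps < eps/5."""
    delta = eps / (5.0 * lip)            # |x1 - x2| < delta gives |phi diff| < eps/5
    n = int((b - a) / delta) + 1         # n subintervals, each of length < delta
    p = int(2.0 * K / (eps / 5.0)) + 1   # p subintervals of [-K, K], each < eps/5
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    step = 2.0 * K / p
    # Index of the horizontal grid line nearest to phi(x_k).
    idx = [min(p, max(0, round((phi(x) + K) / step))) for x in xs]
    ys = [-K + j * step for j in idx]

    def psi(x):
        """Piecewise linear function through the grid vertices (x_k, y_l)."""
        k = min(n - 1, max(0, int((x - a) / (b - a) * n)))
        t = (x - xs[k]) / (xs[k + 1] - xs[k])
        return ys[k] + t * (ys[k + 1] - ys[k])

    return psi, idx, p

eps = 0.5
psi, idx, p = polygonal_approximation(math.sin, 0.0, 3.0, 1.0, 1.0, eps)
worst = max(abs(math.sin(3.0 * i / 1000) - psi(3.0 * i / 1000)) for i in range(1001))
```

Each polygonal line is determined by the list `idx` of grid indices, one index out of `p + 1` possibilities per vertex, which is why only finitely many such lines exist for a given grid.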


11.5. Peano's theorem. Arzelà's theorem has many applications, among
them the following existence theorem for differential equations:


THEOREM 5 (Peano). Let $f(x, y)$ be defined and continuous on a plane
domain $G$. Then at least one integral curve of the differential equation
$$\frac{dy}{dx} = f(x, y) \tag{2}$$
passes through each point $(x_0, y_0)$ of $G$.

Proof. By the continuity of $f$, we have
$$|f(x, y)| < K$$
in some domain $G' \subset G$ containing the point $(x_0, y_0)$. Draw the
lines with slopes $K$ and $-K$ through the point $(x_0, y_0)$. Then draw
vertical lines $x = a$ and $x = b$ ($a < x_0 < b$) which together with the
first two lines form two isosceles triangles contained in $G'$ with common
vertex $(x_0, y_0)$, as shown in Figure 13. This gives a closed interval
$[a, b]$, which will figure in the rest of the proof.

The next step is to construct a family of polygonal lines, called Euler
lines, associated with the differential equation (2). We begin by drawing
the line with slope $f(x_0, y_0)$ through the point $(x_0, y_0)$. Next,
choosing a point $(x_1, y_1)$ on the first line, we draw the line with slope
$f(x_1, y_1)$ through the point $(x_1, y_1)$. Then, choosing a point
$(x_2, y_2)$ on the second line, we draw the line with slope $f(x_2, y_2)$
through the point $(x_2, y_2)$, and so on indefinitely. Suppose we construct
a whole sequence $L_1, L_2, \ldots, L_n, \ldots$ of such Euler lines going
through the point $(x_0, y_0)$, with the property that the length of the
longest line segment making up $L_n$ approaches 0 as $n \to \infty$. Let
$\varphi_n$ be the function with graph $L_n$. Then this gives a family of
functions $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$, all defined on
the interval $[a, b]$, which is easily seen to be uniformly bounded and
equicontinuous (why?). It follows from Arzelà's theorem that the sequence
$\{\varphi_n\}$ contains a uniformly convergent subsequence
$\varphi^{(1)}, \varphi^{(2)}, \ldots, \varphi^{(n)}, \ldots$. Let
$$\varphi(x) = \lim_{n \to \infty} \varphi^{(n)}(x).$$
Then clearly
$$\varphi(x_0) = y_0,$$
so that the curve $y = \varphi(x)$ passes through the point $(x_0, y_0)$.
We now show that $y = \varphi(x)$ satisfies the differential equation (2) in
the open interval $(a, b)$. This means showing that, given any
$\varepsilon > 0$ and any points $x', x'' \in (a, b)$, we have
$$\left| \frac{\varphi(x'') - \varphi(x')}{x'' - x'} - f(x', \varphi(x')) \right| < \varepsilon$$
whenever $|x'' - x'|$ is sufficiently small, or equivalently that
$$\left| \frac{\varphi^{(n)}(x'') - \varphi^{(n)}(x')}{x'' - x'} - f(x', \varphi(x')) \right| < \varepsilon \tag{3}$$
whenever $n$ is sufficiently large and $|x'' - x'|$ is sufficiently small. Let
$y' = \varphi(x')$. Then, by the continuity of $f$, given any
$\varepsilon > 0$, there is a number $\eta > 0$ such that
$$f(x', y') - \varepsilon < f(x, y) < f(x', y') + \varepsilon$$
whenever
$$|x - x'| < 2\eta, \quad |y - y'| < 4K\eta.$$
The set of points $(x, y)$ satisfying these inequalities is a rectangle, which
we denote by $Q$. Let $N$ be so large that for all $n > N$, the length of the
longest segment making up $L_n$ is less than $\eta$ and moreover
$$|\varphi(x) - \varphi^{(n)}(x)| < K\eta.$$
Then all the Euler lines $L_n$ with $n > N$ lie inside the rectangle $Q$
(why?).
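Suppose $L_n$ has vertices $(a_0, b_0), (a_1, b_1), \ldots, (a_{k+1}, b_{k+1})$, where¹³
$$a_0 \le x' < a_1 < a_2 < \cdots < a_k < x'' \le a_{k+1}.$$
Then
$$\varphi^{(n)}(a_1) - \varphi^{(n)}(x') = f(a_0, b_0)(a_1 - x'),$$
$$\varphi^{(n)}(a_{i+1}) - \varphi^{(n)}(a_i) = f(a_i, b_i)(a_{i+1} - a_i) \quad (i = 1, 2, \ldots, k - 1),$$
$$\varphi^{(n)}(x'') - \varphi^{(n)}(a_k) = f(a_k, b_k)(x'' - a_k).$$
Hence, if $|x'' - x'| < \eta$,
$$[f(x', y') - \varepsilon](a_1 - x') < \varphi^{(n)}(a_1) - \varphi^{(n)}(x') < [f(x', y') + \varepsilon](a_1 - x'),$$
$$[f(x', y') - \varepsilon](a_{i+1} - a_i) < \varphi^{(n)}(a_{i+1}) - \varphi^{(n)}(a_i) < [f(x', y') + \varepsilon](a_{i+1} - a_i) \quad (i = 1, 2, \ldots, k - 1),$$
$$[f(x', y') - \varepsilon](x'' - a_k) < \varphi^{(n)}(x'') - \varphi^{(n)}(a_k) < [f(x', y') + \varepsilon](x'' - a_k).$$
Adding these inequalities, we get
$$[f(x', y') - \varepsilon](x'' - x') < \varphi^{(n)}(x'') - \varphi^{(n)}(x') < [f(x', y') + \varepsilon](x'' - x')$$
if $|x'' - x'| < \eta$, which is equivalent to (3). ∎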


Remark. Different subsequences of a sequence of Euler lines may converge
to different solutions of the differential equation (2). Hence the solution
$\varphi$ found in the proof of Theorem 5 may not be the unique solution of
(2) passing through the point $(x_0, y_0)$.
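
The Euler lines used in the proof of Theorem 5 are easy to generate numerically. The sketch below assumes equally spaced vertices to the right of $x_0$ (the proof allows arbitrary vertices on both sides), and the test equation $dy/dx = y$ with $y(0) = 1$, whose solution $e^x$ is known, is chosen only for illustration. The name `euler_line` is hypothetical.

```python
import math

def euler_line(f, x0, y0, b, n):
    """Vertices of the Euler line for dy/dx = f(x, y) on [x0, b] with n equal steps."""
    h = (b - x0) / n
    xs, ys = [x0], [y0]
    for _ in range(n):
        # Follow the line with slope f at the current vertex for one step.
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

def endpoint_error(n):
    """Distance at x = 1 between the Euler line and the exact solution e^x."""
    _, ys = euler_line(lambda x, y: y, 0.0, 1.0, 1.0, n)
    return abs(math.e - ys[-1])

errors = [endpoint_error(n) for n in (10, 100, 1000)]
```

As the longest segment shrinks, the Euler lines approach the solution in this example. For an equation with several solutions through $(x_0, y_0)$, different choices of vertices may approach different solutions, as the Remark points out.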


Problem 1. Let $M$ be a totally bounded subset of a metric space $R$. Prove
that the $\varepsilon$-nets figuring in the definition of total boundedness of
$M$ can always be chosen to consist of points of $M$ rather than of $R$.


Hint. Given an $\varepsilon$-net for $M$ consisting of points
$a_1, a_2, \ldots, a_n \in R$, all within $\varepsilon$ of some point of $M$,
replace each point $a_k$ by a point $b_k \in M$ such that
$\rho(a_k, b_k) < \varepsilon$.


13 To be explicit, we assume that $x'' > x'$. The case $x'' < x'$ is treated similarly.


Problem 2. Prove that every totally bounded metric space is separable. 


Hint. Construct a finite (1/n)-net for every n = 1,2,... Then take the 
union of these nets. 


Problem 3. Let $M$ be a bounded subset of the space $C_{[a,b]}$. Prove that
the set of all functions
$$F(x) = \int_a^x f(t) \, dt$$
with $f \in M$ is compact.


Problem 4. Given two metric compacta $X$ and $Y$, let $C_{XY}$ be the set of
all continuous mappings of $X$ into $Y$. Let distance be defined in $C_{XY}$
by the formula
$$\rho(f, g) = \sup_{x \in X} \rho(f(x), g(x)). \tag{4}$$
Prove that $C_{XY}$ is a metric space. Let $M_{XY}$ be the set of all mappings
of $X$ into $Y$, with the same metric (4). Prove that $C_{XY}$ is closed in
$M_{XY}$.


Hint. Use the method of Problem 1, p. 65 to prove that the limit of a
uniformly convergent sequence of continuous mappings is itself a continuous
mapping.

Problem 5. Let $X$, $Y$ and $C_{XY}$ be the same as in the preceding problem.
Prove the following generalization of Arzelà's theorem: A necessary and
sufficient condition for a set $D \subset C_{XY}$ to be relatively compact is
that $D$ be an equicontinuous family of functions, in the sense that given any
$\varepsilon > 0$, there exists a number $\delta > 0$ such that
$\rho(x', x'') < \delta$ implies $\rho(f(x'), f(x'')) < \varepsilon$
for all $x', x'' \in X$ and all $f \in D$.


Hint. To prove the sufficiency, show that $D$ is relatively compact in
$M_{XY}$ (defined in the preceding problem) and hence in $C_{XY}$, since
$C_{XY}$ is closed in $M_{XY}$. To prove the relative compactness of $D$ in
$M_{XY}$, first represent $X$ as a union of finitely many pairwise disjoint
sets $E_i$ such that $x', x'' \in E_i$ implies $\rho(x', x'') < \delta$. For
example, let $x_1, \ldots, x_n$ be a $(\delta/2)$-net for $X$, and let
$$E_i = S[x_i, \delta/2] - \bigcup_{j < i} S[x_j, \delta/2].$$
Then let $y_1, \ldots, y_m$ be an $\varepsilon$-net in $Y$, and let $L$ be the
set of all functions taking the values $y_j$ on the sets $E_i$. Given any
$f \in D$ and any $x_i \in \{x_1, \ldots, x_n\}$, let
$y_j \in \{y_1, \ldots, y_m\}$ be such that $\rho(f(x_i), y_j) < \varepsilon$
and let $g \in L$ be such that $g(x_i) = y_j$. Show that
$\rho(f(x), g(x)) < 2\varepsilon$, thereby proving that $L$ is a finite
$2\varepsilon$-net for $D$ in $M_{XY}$.


12. Real Functions on Metric and Topological Spaces


12.1. Continuous and uniformly continuous functions and functionals. Let $T$
be a topological space, in particular a metric space. Then by a real function
on $T$ we mean a mapping of $T$ into the space $R^1$ (the real line). For
example, a real function on Euclidean $n$-space $R^n$ is just the usual
“function of $n$ variables.” Suppose $T$ is a function space, i.e., a space
whose elements are functions. Then a real function on $T$ is called a
functional.


Example 1. Let $x(t)$ be a function defined on the interval $[0, 1]$, let
$\varphi(s_0, s_1, \ldots, s_n)$ be a function of $n + 1$ variables defined for
all real values of its arguments, and let $\psi(t, u)$ be a function of two
variables defined for all $t \in [0, 1]$ and all real $u$. Then the following
are all functionals:
$$F_1(x) = \sup_{0 \le t \le 1} x(t),$$
$$F_2(x) = \inf_{0 \le t \le 1} x(t),$$
$$F_3(x) = x(t_0) \quad \text{where } t_0 \in [0, 1],$$
$$F_4(x) = \varphi[x(t_0), x(t_1), \ldots, x(t_n)],$$
$$F_5(x) = \int_0^1 \psi[t, x(t)] \, dt,$$
$$F_6(x) = x'(t_0) \quad \text{where } t_0 \in [0, 1],$$
$$F_7(x) = \int_0^1 \sqrt{1 + x'^2(t)} \, dt,$$
$$F_8(x) = \int_0^1 |x'(t)| \, dt.$$
The functionals $F_1$, $F_2$, $F_3$, $F_4$ and $F_5$ are defined on the space
$C$ of all functions continuous on the interval $[0, 1]$. On the other hand,
$F_6$ is defined only for functions differentiable at the point $t_0$, $F_7$
is defined only for functions such that the expression $\sqrt{1 + x'^2(t)}$ is
integrable, and $F_8$ is defined only for functions with integrable $|x'(t)|$.


Example 2. The functional $F_1$ is continuous on $C$, since
$$\rho(x, y) = \sup |x - y|, \quad |\sup x - \sup y| \le \sup |x - y|.$$

Example 3. The functional $F_6$ is discontinuous on $C$ at any point $x_0$
where it is defined. In fact, let $x(t)$ be such that $x'(t_0) = 1$ and
$|x(t)| < \varepsilon$, and let $y = x_0 + x$. Then
$y'(t_0) = x_0'(t_0) + 1$ even though $\rho(x_0, y) < \varepsilon$. However,
$F_6$ is continuous if it is defined on the space $C^{(1)}$ of all functions
continuously differentiable on the interval $[0, 1]$, with metric
$$\rho(x, y) = \sup_{0 \le t \le 1} [|x(t) - y(t)| + |x'(t) - y'(t)|]$$
(why?).

Example 4. The functional $F_7$ is also discontinuous on $C$. In fact, let
$$x_0(t) = 0, \quad x_n(t) = \frac{1}{n} \sin 2\pi n t.$$
Then
$$\rho(x_n, x_0) = \frac{1}{n} \to 0,$$
but $F_7(x_n) > 4$ for all $n$ while $F_7(x_0) = 1$. Hence $F_7(x_n)$ fails to
approach $F_7(x_0)$ even though $x_n \to x_0$.
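
The inequality $F_7(x_n) > 4$ in Example 4 can be checked numerically. The sketch below approximates the arc length by the length of an inscribed polygonal line, which never exceeds the true length, so a polygonal length above 4 already confirms the bound. The names `polyline_length` and `x_n` and the choice of 400 sample points per period are assumptions made for illustration.

```python
import math

def polyline_length(x, samples):
    """Length of the polygonal line through (t, x(t)) at equally spaced t in [0, 1]."""
    pts = [(k / samples, x(k / samples)) for k in range(samples + 1)]
    return sum(math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(pts, pts[1:]))

def x_n(n):
    """The function x_n(t) = (1/n) sin(2 pi n t) of Example 4."""
    return lambda t: math.sin(2.0 * math.pi * n * t) / n

lengths = {n: polyline_length(x_n(n), 400 * n) for n in (1, 5, 20)}
sup_norms = {n: max(abs(x_n(n)(k / 4000)) for k in range(4001)) for n in (1, 5, 20)}
```

The sup norm of $x_n$ shrinks like $1/n$, while the length stays above 4 for every $n$: each of the $n$ periods contributes four rises or falls of height $1/n$.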


The ordinary concept of uniform continuity generalizes at once to the 
case of arbitrary metric spaces: 


DEFINITION 1. A real function $f(x)$ defined on a metric space $R$ is said
to be uniformly continuous on $R$ if, given any $\varepsilon > 0$, there is a
$\delta > 0$ such that $\rho(x_1, x_2) < \delta$ implies
$|f(x_1) - f(x_2)| < \varepsilon$ for all $x_1, x_2 \in R$.

The reader will recall from calculus that a real function continuous on a
closed interval $[a, b]$ is uniformly continuous on $[a, b]$. This fact is a
special case of


THEOREM 1. A real function $f$ continuous on a compact metric space $R$
is uniformly continuous on $R$.


Proof. Suppose $f$ is continuous but not uniformly continuous on $R$.
Then for some positive $\varepsilon$ and every $n$ there are points $x_n$ and
$x_n'$ in $R$ such that
$$\rho(x_n, x_n') < \frac{1}{n} \tag{1}$$
but
$$|f(x_n) - f(x_n')| \ge \varepsilon. \tag{2}$$
Since $R$ is compact, the sequence $\{x_n\}$ has a subsequence $\{x_{n_k}\}$
converging to a point $x \in R$. Hence $\{x_{n_k}'\}$ also converges to $x$,
because of (1). But then at least one of the inequalities
$$|f(x) - f(x_{n_k})| \ge \frac{\varepsilon}{2}, \quad |f(x) - f(x_{n_k}')| \ge \frac{\varepsilon}{2}$$
must hold for arbitrary $k$, because of (2). This contradicts the assumed
continuity of $f$ at $x$. ∎


12.2. Continuous and semicontinuous functions on compact spaces. As just
shown, the theorem on uniform continuity of a function continuous on a
closed interval generalizes to functions continuous on arbitrary metric
compacta. There are other properties of functions continuous on a closed
interval which generalize to arbitrary compact spaces (not necessarily metric
spaces):

THEOREM 2. A real function $f$ continuous on a compact topological
space $T$ is bounded on $T$.¹⁴ Moreover $f$ achieves its least upper bound and
greatest lower bound on $T$.


Proof. A continuous real function on $T$ is a continuous mapping of
$T$ into the real line $R^1$. The image of $T$ in $R^1$ is compact, by
Theorem 5, p. 94. But every compact subset of $R^1$ is bounded and closed (see
p. 101). Hence $f$ is bounded on $T$. Moreover, since the image of $T$ is
closed, it contains its own least upper bound and greatest lower bound, so
that $f$ not only has a least upper bound and greatest lower bound on $T$, but
actually achieves these bounds at points of $T$. ∎


Theorem 2 can be generalized to a larger class of functions, which we 
now introduce: 


DEFINITION 2. A (real) function $f$ defined on a topological space $T$ is
said to be upper semicontinuous at a point $x_0 \in T$ if, given any
$\varepsilon > 0$, there exists a neighborhood of $x_0$ in which
$f(x) < f(x_0) + \varepsilon$. Similarly, $f$ is said to be lower
semicontinuous at $x_0$ if, given any $\varepsilon > 0$, there exists a
neighborhood of $x_0$ in which $f(x) > f(x_0) - \varepsilon$.


Example 1. Let $[x]$ be the integral part of $x$, i.e., the largest integer
$\le x$. Then $f(x) = [x]$ is upper semicontinuous for all $x$.
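
A sampled check of Example 1 shows the asymmetry between the two conditions of Definition 2 at an integer point. The sketch below only samples finitely many points of each neighborhood, so it is numerical evidence rather than a proof, and the names `holds_upper` and `holds_lower` are hypothetical.

```python
import math

def sample_points(x0, radius, samples=1001):
    """Equally spaced sample of the neighborhood [x0 - radius, x0 + radius]."""
    return [x0 - radius + 2.0 * radius * k / (samples - 1) for k in range(samples)]

def holds_upper(f, x0, eps, radius):
    """Sampled test of f(x) < f(x0) + eps on the neighborhood."""
    return all(f(x) < f(x0) + eps for x in sample_points(x0, radius))

def holds_lower(f, x0, eps, radius):
    """Sampled test of f(x) > f(x0) - eps on the neighborhood."""
    return all(f(x) > f(x0) - eps for x in sample_points(x0, radius))

radii = (0.25, 0.01, 1e-6)
upper_at_2 = [holds_upper(math.floor, 2.0, 0.5, r) for r in radii]
lower_at_2 = [holds_lower(math.floor, 2.0, 0.5, r) for r in radii]
```

At $x_0 = 2$ the upper condition holds on every sampled neighborhood, while the lower condition fails however small the radius, because $[x] = 1$ immediately to the left of 2. At a non-integer point both conditions hold on a small enough neighborhood.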


Example 2. Given a continuous function $f$, suppose we increase the value
$f(x_0)$ taken by $f$ at the point $x_0$. Then $f$ becomes upper
semicontinuous at $x_0$. Similarly, $f$ becomes lower semicontinuous at $x_0$
if we decrease $f(x_0)$. Moreover, $f$ is upper semicontinuous if and only if
$-f$ is lower semicontinuous. These facts can be used to construct many more
examples of semicontinuous functions.


In studying the properties of semicontinuous functions, it is convenient
to allow them to take infinite values. If $f(x_0) = +\infty$, we regard $f$ as
upper semicontinuous at $x_0$. The function $f$ is also regarded as lower
semicontinuous at $x_0$ if, given any $h > 0$, there is a neighborhood of
$x_0$ in which $f(x) > h$. Similarly, if $f(x_0) = -\infty$, we regard $f$ as
lower semicontinuous at $x_0$, and at the same time upper semicontinuous at
$x_0$ if, given any $h > 0$, there is a neighborhood of $x_0$ in which
$f(x) < -h$.


14 A real function (or functional) $f$ is said to be bounded on a set $E$ if
$f(E)$ is contained in some interval $[-C, C]$.


We now prove the promised generalization of Theorem 2:


THEOREM 2'. A finite lower semicontinuous function $f$ defined on a
compact topological space $T$ is bounded from below.


Proof. Suppose to the contrary that $\inf f(x) = -\infty$. Then there
exists a sequence $\{x_n\}$ such that $f(x_n) < -n$. Since $T$ is compact, the
infinite set $E = \{x_1, x_2, \ldots, x_n, \ldots\}$ has at least one limit
point $x_0$. Since $f$ is finite and lower semicontinuous at $x_0$, there is a
neighborhood $U$ of $x_0$ in which $f(x) > f(x_0) - 1$. But then $U$ can only
contain finitely many points of $E$, so that $x_0$ cannot be a limit point of
$E$. ∎


THEOREM 2''. A finite lower semicontinuous function $f$ defined on a
compact topological space $T$ achieves its greatest lower bound on $T$.


Proof. By Theorem 2', $\inf f(x)$ is finite. Clearly, there exists a
sequence $\{x_n\}$ such that
$$f(x_n) < \inf f(x) + \frac{1}{n}.$$
By the compactness of $T$, the set $E = \{x_1, x_2, \ldots, x_n, \ldots\}$ has
at least one limit point $x_0$. If $f(x_0) > \inf f$, then, by the
semicontinuity of $f$ at $x_0$, there is a neighborhood $U$ of the point $x_0$
and a $\delta > 0$ such that $f(x) > \inf f + \delta$ for all $x \in U$. But
then $U$ cannot contain an infinite subset of $E$, i.e., $x_0$ cannot be a
limit point of $E$. It follows that $f(x_0) = \inf f$. ∎


Remark. Theorems 2' and 2'' remain true if the words “lower,” “below,”
and “greatest” are replaced by “upper,” “above,” and “least.” The details
are left as an exercise.


We conclude this section with some useful terminology: 


DEFINITION 3. Given a real function $f$ defined on a metric space $R$, the
(finite or infinite) quantity
$$\bar{f}(x_0) = \lim_{\varepsilon \to 0} \left\{ \sup_{x \in S(x_0, \varepsilon)} f(x) \right\}$$
is called the upper limit of $f$ at $x_0$, while the (finite or infinite)
quantity
$$\underline{f}(x_0) = \lim_{\varepsilon \to 0} \left\{ \inf_{x \in S(x_0, \varepsilon)} f(x) \right\}$$
is called the lower limit of $f$ at $x_0$. The difference
$$\omega f(x_0) = \bar{f}(x_0) - \underline{f}(x_0),$$
provided it exists,¹⁵ is called the oscillation of $f$ at $x_0$.





15 I.e., provided at least one of the numbers $\bar{f}(x_0)$, $\underline{f}(x_0)$ is finite.

[Figure 14: panels (a), (b), (c)]

12.3. Continuous curves in metric spaces. Instead of mappings of a metric
space into the real line, we now consider mappings of a subset of the real
line into a metric space. More exactly, let $P = f(t)$ be a continuous
mapping of the interval $a \le t \le b$ into a metric space $R$. As $t$
“traverses” the interval from $a$ to $b$, the point $P = f(t)$ “traverses a
continuous curve” in the space $R$. Before giving a formal definition
corresponding to this rough idea of a “curve,” we make two key observations:


1) The order in which points are traversed will be regarded as an essential
property of a curve. For example, the set of points shown in Figure
14(a) gives rise to two distinct curves when traversed in the two distinct
ways shown in Figures 14(b) and 14(c). Similarly, the function shown
in Figure 15(a), defined in the interval $0 \le t \le 1$, determines a “curve”
filling up the segment $0 \le y \le 1$ of the y-axis, but this curve is
traversed three times (twice upward and once downward) and hence is distinct
from the segment $0 \le y \le 1$ traversed just once from the point $y = 0$
to the point $y = 1$.


2) The choice of the parameter $t$ will be regarded as unimportant,
provided a change in parameter does not change the order in which
the points of the curve are traversed. Thus the functions shown in
Figures 15(a) and 15(b) represent the same curve, even though a given
point of the curve corresponds to different parameter values in the
two cases. For example, the point A in Figure 15(a) corresponds to
two isolated points C and D on the t-axis, while in Figure 15(b) the
same point A corresponds to an isolated point C and a whole line
segment DE (note that the point on the curve does not move at all
as $t$ traverses the segment DE).


We now give a formal definition of a curve, embodying these qualitative
ideas. Two continuous functions
$$P = f(t'), \quad P = g(t''),$$
defined on intervals
$$a' \le t' \le b', \quad a'' \le t'' \le b''$$
and taking values in a metric space $R$, are said to be equivalent if there
exist two continuous nondecreasing functions
$$t' = \varphi(t), \quad t'' = \psi(t),$$
defined on the same interval
$$a \le t \le b,$$
such that
$$\varphi(a) = a', \quad \varphi(b) = b',$$
$$\psi(a) = a'', \quad \psi(b) = b''$$
and
$$f(\varphi(t)) = g(\psi(t)) \quad \text{for all } t \in [a, b].$$
It is easy to see that this relation of equivalence is reflexive ($f$ is
equivalent to $f$), symmetric (if $f$ is equivalent to $g$, then $g$ is
equivalent to $f$) and transitive (if $f$ is equivalent to $g$ and $g$ is
equivalent to $h$, then $f$ is equivalent to $h$). Hence the set of all
continuous functions of the given type can be partitioned into classes of
equivalent functions (cf. Sec. 1.4), and each such class is said to define a
(continuous) curve in the space $R$.

For each function $P = f(t')$ defined on an interval $[a', b']$, there is an
equivalent function defined on the interval $[a'', b''] = [0, 1]$. In fact, we
need only make the choice
$$t' = \varphi(t) = (b' - a')t + a', \quad t'' = \psi(t) = t.$$


Thus every curve can be regarded as specified parametrically in terms of a
function defined on the unit interval $I = [0, 1]$. By the same token, it is
often convenient¹⁶ to introduce the space $C(I, R)$ of continuous mappings $f$
of the interval $I$ into the space $R$, equipped with the metric
$$\hat{\rho}(f, g) = \sup_{0 \le t \le 1} \rho(f(t), g(t)), \tag{3}$$
where $\rho$ is the metric in the space $R$.





16 Cf. Problems 7-12.


Problem 1. Let the functionals $F_1$, $F_2$, $F_3$, $F_4$, $F_5$ and the space
$C$ be the same as on p. 108. Prove that


a) $F_1$, $F_2$ and $F_3$ are continuous on $C$;
b) $F_4$ is continuous on $C$ if the function $\varphi$ is continuous in all its arguments;
c) $F_5$ is uniformly continuous on $C$.


Define $F_1$, $F_2$, $F_3$ and $F_4$ on a space larger than $C$.


Problem 2. Let the functionals $F_7$, $F_8$ and the spaces $C$, $C^{(1)}$ be
the same as on p. 108. Prove that


a) $F_8$ is discontinuous on $C$;
b) $F_7$ and $F_8$ are continuous on $C^{(1)}$.


Problem 3. Let $M$ be the space of all bounded real functions defined on
the interval $[a, b]$, with metric $\rho(f, g) = \sup |f - g|$. By the length
of the curve
$$y = f(x) \quad (a \le x \le b)$$
is meant the functional
$$L(f) = \sup \sum_{i=1}^{n} \sqrt{(x_i - x_{i-1})^2 + (f(x_i) - f(x_{i-1}))^2},$$
where the least upper bound (which may equal $+\infty$) is taken over all
possible partitions of $[a, b]$ obtained by introducing points of subdivision
$x_0, x_1, x_2, \ldots, x_n$ such that
$$a = x_0 < x_1 < \cdots < x_n = b.$$
Prove that

a) For continuous functions
$$L(f) = \lim_{\max |x_i - x_{i-1}| \to 0} \sum_{i=1}^{n} \sqrt{(x_i - x_{i-1})^2 + (f(x_i) - f(x_{i-1}))^2};$$

b) For continuously differentiable functions
$$L(f) = \int_a^b \sqrt{1 + f'^2(x)} \, dx;$$

c) The functional $L(f)$ is lower semicontinuous on $M$.


Problem 4. Let $\bar{f}$, $\underline{f}$ and $\omega f$ be the same as in
Definition 3. Prove that


a) $\bar{f}$ is upper semicontinuous;

b) $\underline{f}$ is lower semicontinuous;

c) $f$ is continuous at $x_0$ if and only if
$-\infty < \underline{f}(x_0) = \bar{f}(x_0) < \infty$, i.e., if and only if
$\omega f(x_0) = 0$.


Problem 5. Let $K$ be a metric compactum and $A$ a mapping of $K$ into
itself such that $\rho(Ax, Ay) < \rho(x, y)$ if $x \ne y$. Prove that $A$ has
a unique fixed point in $K$. Reconcile this with Problem 1, p. 76.


Problem 6. Let $K$ be a metric compactum and $\{f_n(x)\}$ a sequence of
continuous functions on $K$, increasing in the sense that
$$f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots$$
Prove that if $\{f_n(x)\}$ converges to a continuous function on $K$, then the
convergence is uniform (Dini's theorem).


Problem 7. A sequence of curves $\{\Gamma_n\}$ in a metric space $R$ is said
to converge to a curve $\Gamma$ in $R$ if the curves $\Gamma_n$ and $\Gamma$
have parametric representations
$$P = f_n(t) \quad (0 \le t \le 1)$$
and
$$P = f(t) \quad (0 \le t \le 1),$$
respectively, such that
$$\lim_{n \to \infty} \hat{\rho}(f, f_n) = 0,$$
where $\hat{\rho}$ is the metric (3) of the space $C(I, R)$ introduced on
p. 113. Prove that if a sequence of curves in a compact metric space $R$ can
be represented parametrically by an equicontinuous family of functions on
$[0, 1]$, then the sequence contains a convergent subsequence.


Hint. Use Problem 5, p. 107.


Problem 8. Let $\Gamma$ be a curve in a metric space $R$, with parametric
representation
$$P = f(t) \quad (a \le t \le b).$$
By the length of $\Gamma$ is meant the functional
$$L(\Gamma) = \sup \sum_{i=1}^{n} \rho(f(t_{i-1}), f(t_i)),$$
where $\rho$ is the metric in $R$ and the least upper bound (which may equal
$+\infty$) is taken over all possible partitions of $[a, b]$ obtained by
introducing points of subdivision $t_0, t_1, t_2, \ldots, t_n$ such that
$$a = t_0 < t_1 < t_2 < \cdots < t_n = b.$$
Prove that $L(\Gamma)$ is independent of the parametric representation of
$\Gamma$. Suppose we choose $a = 0$, $b = 1$, thereby confining ourselves to
parametric representations of the form
$$P = f(t) \quad (0 \le t \le 1).$$
Prove that $L(f)$ is then a lower semicontinuous functional on the space
$C(I, R)$ introduced on p. 113. Equivalently, prove that if a sequence of
curves $\{\Gamma_n\}$ converges to a curve $\Gamma$, in the sense of
Problem 7, then $L(\Gamma)$ does not exceed the smallest limit point (i.e.,
the lower limit) of the sequence $\{L(\Gamma_n)\}$.


Problem 9. Given a metric space $R$ with metric $\rho$, let $\Gamma$ be a
curve in $R$ of finite length $S$ with parametric representation
$$P = f(t) \quad (a \le t \le b).$$
Let $s = \varphi(T)$ be the length of the arc
$$P = f(t) \quad (a \le t \le T)$$
(where $T \le b$), i.e., the arc of $\Gamma$ going from the “initial point”
$P_a = f(a)$ to the “final point” $P_T = f(T)$. Then $\Gamma$ has a parametric
representation of the form
$$P = g(s) \quad (0 \le s \le S),$$
where $g(s) = f(\varphi^{-1}(s))$ if $\varphi$ is one-to-one. Prove that
$$\rho(g(s_1), g(s_2)) \le |s_1 - s_2|.$$
Hint. The length of an arc is no less than the length of the inscribed chord.


Problem 10. In the preceding problem, let $\tau = s/S$. Then $\Gamma$ has a
parametric representation
$$P = F(\tau) = g(S\tau) \quad (0 \le \tau \le 1)$$
in terms of a function $F$ defined on the unit interval $[0, 1]$. Prove that
$F$ satisfies a Lipschitz condition of the form
$$\rho(F(\tau_1), F(\tau_2)) \le S|\tau_1 - \tau_2|.$$
Suppose $R$ is compact and let $\{\Gamma_n\}$ be a sequence of curves, all of
length less than some finite number $M$. Prove that $\{\Gamma_n\}$ contains a
convergent subsequence, where convergence of curves is defined as in
Problem 7.


Problem 11. Given a compact metric space R, suppose two points A and B 
in R can be joined by a continuous curve of finite length. Prove that among 
all such curves, there is a curve of least length. 


Comment. Even in the case where $R$ is a “smooth” (i.e., sufficiently
differentiable) closed surface in Euclidean 3-space, this result is not
amenable to the methods of elementary differential geometry, which ordinarily
deals only with the case of “neighboring” points $A$ and $B$.


Problem 12. Let $\mathcal{C}$ be the set of all curves in a given metric
space $R$. Define the distance between two curves
$\Gamma_1, \Gamma_2 \in \mathcal{C}$ by the formula
$$\tilde{\rho}(\Gamma_1, \Gamma_2) = \inf \hat{\rho}(f_1, f_2), \tag{4}$$
where $\hat{\rho}$ is the metric (3) in the space $C(I, R)$, and the greatest
lower bound is taken over all possible representations
$$P = f_1(t) \quad (0 \le t \le 1) \tag{5}$$
of $\Gamma_1$ and
$$P = f_2(t) \quad (0 \le t \le 1) \tag{6}$$
of $\Gamma_2$. Prove that the metric $\tilde{\rho}$ makes $\mathcal{C}$ into a
metric space.


Comment. The fact that $\tilde{\rho}(\Gamma_1, \Gamma_2) = 0$ implies the
identity of $\Gamma_1$ and $\Gamma_2$ follows from the (not very easily
proved) fact that the greatest lower bound in (4) is achieved for a suitable
choice of the parametric representations (5) and (6).


4 


LINEAR SPACES 


13. Basic Concepts 


13.1. Definitions and examples. One of the most important concepts in
mathematics is that of a linear space, which will play a key role in the rest
of this book:


DEFINITION 1. A nonempty set $L$ of elements $x, y, z, \ldots$ is said to be a
linear space (or vector space) if it satisfies the following three axioms:

1) Any two elements $x, y \in L$ uniquely determine a third element
   $x + y \in L$, called the sum of $x$ and $y$, such that

   a) $x + y = y + x$ (commutativity);

   b) $(x + y) + z = x + (y + z)$ (associativity);

   c) There exists an element $0 \in L$, called the zero element, with the
      property that $x + 0 = x$ for every $x \in L$;

   d) For every $x \in L$ there exists an element $-x$, called the negative
      of $x$, with the property that $x + (-x) = 0$;

2) Any number $\alpha$ and any element $x \in L$ uniquely determine an element
   $\alpha x \in L$, called the product of $\alpha$ and $x$, such that

   a) $\alpha(\beta x) = (\alpha\beta)x$;

   b) $1x = x$;

3) The operations of addition and multiplication obey two distributive
   laws:

   a) $(\alpha + \beta)x = \alpha x + \beta x$;

   b) $\alpha(x + y) = \alpha x + \alpha y$.
Remark. The elements of $L$ are called "points" or "vectors," while the
numbers $\alpha, \beta, \ldots$ are often called "scalars." If $\alpha$ is an arbitrary real number,
$L$ is called a real linear space, while if $\alpha$ is an arbitrary complex number, $L$
is called a complex linear space.¹ Unless the contrary is explicitly stated, the
considerations that follow will be valid for both real and complex spaces.
Clearly, any complex linear space reduces to a real linear space if we allow
vectors to be multiplied by real numbers only.


We now give some examples of linear spaces, leaving it to the reader
to verify in detail that the conditions in Definition 1 are satisfied in each case.²


Example 1. The real line (the set of all real numbers) with the usual 
arithmetic operations of addition and multiplication is a linear space. 


Example 2. The set of all ordered $n$-tuples

    $x = (x_1, x_2, \ldots, x_n)$

of real or complex numbers $x_1, x_2, \ldots, x_n$, with sums and "scalar multiples"
defined by the formulas

    $(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$
    $\alpha(x_1, x_2, \ldots, x_n) = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n),$

is also a linear space. This space is called $n$-dimensional (vector) space, or
simply $n$-space, denoted by $R^n$ in the real case and $C^n$ in the complex case.
(Concerning the precise meaning of the term "$n$-dimensional," see Sec. 13.2.)


Example 3. The set of all (real or complex) functions continuous on an
interval $[a, b]$, with the usual operations of addition of functions and
multiplication of functions by numbers, forms a linear space $C_{[a,b]}$, one of the
most important spaces in analysis.


Example 4. The set $l_2$ of all infinite sequences

    $x = (x_1, x_2, \ldots, x_k, \ldots)$    (1)

of real or complex numbers $x_1, x_2, \ldots, x_k, \ldots$ satisfying the convergence
condition

    $\sum_{k=1}^{\infty} |x_k|^2 < \infty,$

equipped with operations

    $(x_1, x_2, \ldots, x_k, \ldots) + (y_1, y_2, \ldots, y_k, \ldots) = (x_1 + y_1, x_2 + y_2, \ldots, x_k + y_k, \ldots),$
    $\alpha(x_1, x_2, \ldots, x_k, \ldots) = (\alpha x_1, \alpha x_2, \ldots, \alpha x_k, \ldots),$    (2)

is a linear space. The fact that

    $\sum_{k=1}^{\infty} |x_k|^2 < \infty, \qquad \sum_{k=1}^{\infty} |y_k|^2 < \infty$

implies

    $\sum_{k=1}^{\infty} |x_k + y_k|^2 < \infty$

is an immediate consequence of the elementary inequality

    $(x_k + y_k)^2 \le 2x_k^2 + 2y_k^2.$


1 More generally, one can consider linear spaces over an arbitrary field.

2 It will be noted that certain symbols like $R^n$, $C_{[a,b]}$, $l_2$ and $m$ are used here with
somewhat different meanings than in Sec. 5.1. The point is that there is no metric here,
at least for the time being, while on the other hand, sums and scalar multiples of vectors
were not defined in Chaps. 2 and 3.

Example 5. Let $c$ be the set of all convergent sequences (1), $c_0$ the set of
all sequences (1) converging to zero, $m$ the set of all bounded sequences,
and $R^\infty$ the set of all sequences (1). Then $c$, $c_0$, $m$ and $R^\infty$ are all linear spaces,
provided that in each case addition of sequences and multiplication of
sequences by numbers are defined by (2).


Since linear spaces are defined in terms of two operations, addition 
of elements and multiplication of elements by numbers, it is natural to 
introduce 


DEFINITION 2. Two linear spaces $L$ and $L^*$ are said to be isomorphic if
there is a one-to-one correspondence $x \leftrightarrow x^*$ between $L$ and $L^*$ which
preserves operations, in the sense that

    $x \leftrightarrow x^*, \qquad y \leftrightarrow y^*$

(where $x, y \in L$, $x^*, y^* \in L^*$) implies

    $x + y \leftrightarrow x^* + y^*$

and

    $\alpha x \leftrightarrow \alpha x^*$

($\alpha$ an arbitrary number).


Remark. It is sometimes convenient to regard isomorphic linear spaces
as different "realizations" of one and the same linear space.


13.2. Linear dependence. We say that the elements $x, y, \ldots, w$ of a linear
space $L$ are linearly dependent if there exist numbers $\alpha, \beta, \ldots, \lambda$, not all zero,
such that³

    $\alpha x + \beta y + \cdots + \lambda w = 0.$    (3)

If no such numbers exist, the elements $x, y, \ldots, w$ are said to be linearly
independent. In other words, the elements $x, y, \ldots, w$ are linearly
independent if and only if (3) implies

    $\alpha = \beta = \cdots = \lambda = 0.$

More generally, the elements $x, y, \ldots$ belonging to some infinite set $E \subset L$
are said to be linearly independent if the elements belonging to every finite
subset of $E$ are linearly independent.

3 The left-hand side of (3) is called a linear combination of the elements $x, y, \ldots, w$.

A linear space $L$ is said to be $n$-dimensional (or of dimension $n$) if $n$ linearly
independent elements can be found in $L$, but any $n + 1$ elements of $L$ are
linearly dependent. Suppose $n$ linearly independent elements can be found
in $L$ for every $n$. Then $L$ is said to be infinite-dimensional, but otherwise $L$
is said to be finite-dimensional. Any set of $n$ linearly independent elements of
an $n$-dimensional space $L$ is called a basis in $L$.
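
The definition of linear dependence can be tried out concretely in $R^n$. The following is a minimal sketch, not part of the original text: it decides whether finitely many vectors with rational components are linearly independent by Gaussian elimination in exact arithmetic. The function names `rank` and `linearly_independent` are illustrative assumptions.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def linearly_independent(vectors):
    """True if no nontrivial linear combination of the vectors equals zero."""
    return rank(vectors) == len(vectors)


basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(linearly_independent(basis))                # three independent vectors in R^3
print(linearly_independent(basis + [[1, 1, 1]]))  # any four vectors in R^3 are dependent
```

The second call illustrates the defining property of a 3-dimensional space: three independent elements exist, but any four are dependent.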


Remark. The typical course on linear algebra deals with finite-dimensional 
linear spaces. Here, however, we will be primarily concerned with infinite- 
dimensional spaces, the case of greater interest from the standpoint of 
mathematical analysis. 


13.3. Subspaces. Given a nonempty subset $L'$ of a linear space $L$, suppose
$L'$ is itself a linear space with respect to the operations of addition and
multiplication defined in $L$. Then $L'$ is said to be a subspace (of $L$). In other
words, we say that $L' \subset L$ is a subspace if $x \in L'$, $y \in L'$ implies $\alpha x + \beta y \in L'$
for arbitrary $\alpha$ and $\beta$. The "trivial space" consisting of the zero element alone
is a subspace of every linear space $L$. At the opposite extreme, $L$ can always
be regarded as a subspace of itself. By a proper subspace of a linear space $L$,
we mean a subspace which is distinct from $L$ itself and contains at least
one nonzero element.


Example 1. Let $L$ be any linear space, and $x$ any nonzero element of $L$.
Then the set $\{\lambda x\}$ of all scalar multiples of $x$, where $\lambda$ ranges over all (real or
complex) numbers, is obviously a one-dimensional subspace of $L$, in fact a
proper subspace if the dimension of $L$ exceeds 1.

Example 2. The set $P_{[a,b]}$ of all polynomials on $[a, b]$ is a proper subspace
of the set $C_{[a,b]}$ of all continuous functions on $[a, b]$. Like $C_{[a,b]}$ itself, $P_{[a,b]}$
is infinite-dimensional. At the same time, $C_{[a,b]}$ is itself a proper subspace of
the set of all functions on $[a, b]$, both continuous and discontinuous.

Example 3. Each of the linear spaces $l_2$, $c_0$, $c$, $m$ and $R^\infty$ (in that order)
is a proper subspace of the next one.


Given a linear space $L$, let $\{x_\alpha\}$ be any nonempty set of elements $x_\alpha \in L$.
Then $L$ has a smallest subspace (possibly $L$ itself) containing $\{x_\alpha\}$.⁴ In fact,
there is at least one such subspace, namely $L$ itself. Moreover, it is clear
that the intersection of any system of subspaces $\{L_\gamma\}$ is itself a subspace,
since if $L^* = \bigcap_\gamma L_\gamma$ and $x, y \in L^*$, then $\alpha x + \beta y \in L^*$ for all $\alpha$ and $\beta$ (why?).
The smallest subspace of $L$ containing the set $\{x_\alpha\}$ is then just the intersection
of all subspaces containing $\{x_\alpha\}$. This minimal subspace, denoted by $L(\{x_\alpha\})$,
is called the (linear) subspace generated by $\{x_\alpha\}$ or the linear hull of $\{x_\alpha\}$.

4 Here we use curly brackets in the same way as in footnote 10, p. 92.


13.4. Factor spaces. Let $L$ be a linear space and $L'$ a subspace of $L$.
Then two elements $x, y \in L$ are said to belong to the same (residue) class
generated by $L'$ if the difference $x - y$ belongs to $L'$. The set of all such
classes is called the factor space (or quotient space) of $L$ relative to $L'$, denoted
by $L/L'$. The operations of addition of elements and multiplication of elements
by numbers can be introduced in a factor space $L/L'$ in the following natural
way: Given two elements of $L/L'$, i.e., two classes $\xi$ and $\eta$, we choose a
"representative" from each class, say $x$ from $\xi$ and $y$ from $\eta$. We then
define the sum $\xi + \eta$ of the classes $\xi$ and $\eta$ to be the class containing the
element $x + y$, while the product $\alpha\xi$ of the number $\alpha$ and the class $\xi$ is
defined to be the class containing the element $\alpha x$. Here we rely on the fact
that the classes $\xi + \eta$ and $\alpha\xi$ are independent of the choice of the
"representatives" $x$ and $y$ (why?).


THEOREM 1. Every factor space L/L’, with operations defined in the 
way just described, is a linear space. 


Proof. We need only verify that L/L’ satisfies the three axioms in 
Definition 1. This is almost trivial (give the details). 


Let L be a linear space and L’ a subspace of L. Then the dimension of 
the factor space L/L’ is called the codimension of L’ in L. 


THEOREM 2. Let $L'$ be a subspace of a linear space $L$. Then $L'$ has finite
codimension $n$ if and only if there are linearly independent elements
$x_1, \ldots, x_n$ in $L$ such that every element $x \in L$ has a unique representation
of the form

    $x = \alpha_1 x_1 + \cdots + \alpha_n x_n + y,$    (4)

where $\alpha_1, \ldots, \alpha_n$ are numbers and $y \in L'$.

Proof. Suppose every element $x \in L$ has a unique representation of the
form (4). Given any class $\xi \in L/L'$, let $x$ be any element of $\xi$, and let
$\xi_k$ be the class containing $x_k$ $(k = 1, \ldots, n)$. Then (4) clearly implies

    $\xi = \alpha_1 \xi_1 + \cdots + \alpha_n \xi_n.$

Hence $\xi_1, \ldots, \xi_n$ is a basis for $L/L'$ (the linear independence of $\xi_1, \ldots, \xi_n$
follows from that of $x_1, \ldots, x_n$). In other words, $L/L'$ has dimension
$n$, or equivalently $L'$ has codimension $n$.

Conversely, suppose $L'$ has codimension $n$, so that $L/L'$ has dimension
$n$. Then $L/L'$ has a basis $\xi_1, \ldots, \xi_n$. Given any $x \in L$, let $\xi$ be the class
in $L/L'$ containing $x$. Then

    $\xi = \alpha_1 \xi_1 + \cdots + \alpha_n \xi_n$

for suitable numbers $\alpha_1, \ldots, \alpha_n$. But this means that every element in
$\xi$, in particular $x$, differs only by an element $y \in L'$ from a linear
combination of elements $x_1, \ldots, x_n$, where $x_k$ is any fixed element of
$\xi_k$ $(k = 1, \ldots, n)$, i.e.,

    $x = \alpha_1 x_1 + \cdots + \alpha_n x_n + y \qquad (y \in L')$    (5)

(the linear independence of $x_1, \ldots, x_n$ follows from that of $\xi_1, \ldots, \xi_n$).
Suppose there is another such representation

    $x = \alpha_1' x_1 + \cdots + \alpha_n' x_n + y' \qquad (y' \in L').$    (5')

Then, subtracting (5') from (5), we get

    $0 = (\alpha_1 - \alpha_1')x_1 + \cdots + (\alpha_n - \alpha_n')x_n + y'' \qquad (y'' \in L'),$

and hence

    $0 = (\alpha_1 - \alpha_1')\xi_1 + \cdots + (\alpha_n - \alpha_n')\xi_n,$

where in the last equation 0 means the class containing the zero element
of $L$, i.e., the space $L'$ itself. But $\xi_1, \ldots, \xi_n$ are linearly independent,
and hence $\alpha_1 = \alpha_1', \ldots, \alpha_n = \alpha_n'$.


13.5. Linear functionals. A numerical function $f$ defined on a linear space
$L$ is called a functional (on $L$).⁵ A functional $f$ is said to be additive if

    $f(x + y) = f(x) + f(y)$

for all $x, y \in L$ and homogeneous if

    $f(\alpha x) = \alpha f(x)$

for every number $\alpha$. A functional defined on a complex linear space is called
conjugate-homogeneous if

    $f(\alpha x) = \bar{\alpha} f(x)$

for every number $\alpha$, where $\bar{\alpha}$ is the complex conjugate of $\alpha$. An additive
homogeneous functional is called a linear functional, while an additive
conjugate-homogeneous functional is called a conjugate-linear functional.

5 The word "functional" has already been used in a somewhat different sense in Sec.
12.1, where a functional means a real function defined on a function space (topological
or metric). Later on, we will deal with linear spaces which are also metric spaces and
have functions as their elements. The two uses of the word "functional" will then coincide
(if we allow complex-valued functionals).


Example 1. Let $R^n$ be real $n$-space, with elements $x = (x_1, \ldots, x_n)$, and
let $a = (a_1, \ldots, a_n)$ be a fixed element of $R^n$. Then

    $f(x) = \sum_{k=1}^{n} a_k x_k$

is a linear functional on $R^n$. Similarly,

    $f(x) = \sum_{k=1}^{n} a_k \bar{x}_k$

is a conjugate-linear functional on complex $n$-space $C^n$.


Example 2. Consider the integral

    $I(x) = \int_a^b x(t)\,dt,$

or more generally

    $I(x) = \int_a^b x(t)\varphi(t)\,dt,$

where $\varphi(t)$ is a fixed continuous function on $[a, b]$. It follows at once from
elementary properties of integrals that $I(x)$ is a linear functional on $C_{[a,b]}$.
Similarly, the integral

    $\bar{I}(x) = \int_a^b \overline{x(t)}\,dt,$

or more generally

    $\bar{I}(x) = \int_a^b \overline{x(t)}\varphi(t)\,dt,$

is a conjugate-linear functional on $C_{[a,b]}$.
Example 3. Another kind of linear functional on the space $C_{[a,b]}$ is the
functional

    $\delta_{t_0}(x) = x(t_0),$

which assigns to each function $x(t) \in C_{[a,b]}$ its value at some fixed point
$t_0 \in [a, b]$. In mathematical physics, particularly in quantum mechanics, this
functional is often written in the form

    $\delta_{t_0}(x) = \int_a^b x(t)\,\delta(t - t_0)\,dt,$

where $\delta(t)$ is a "fictitious" or "generalized" function, called the (Dirac)
delta function, which equals zero everywhere except at $t = 0$ and has an
integral equal to 1.⁶ As we will see in Sec. 20.3, the delta function can be
represented as the limit, in a suitable sense, of a sequence of "true" functions
$\varphi_n$, each vanishing outside of some $\varepsilon_n$-neighborhood of the point $t = 0$ and
satisfying the condition

    $\int \varphi_n(t)\,dt = 1$

($\varepsilon_n \to 0$ as $n \to \infty$).

6 Clearly, no "true" function can have these properties!

Example 4. Let $n$ be a fixed positive integer, and let

    $x = (x_1, x_2, \ldots, x_k, \ldots)$

be an arbitrary element of $l_2$. Then

    $f_n(x) = x_n$

is obviously a linear functional on $l_2$. The same functional can be defined
on other spaces whose elements are sequences, e.g., on the spaces $c_0$, $c$, $m$
and $R^\infty$ considered in Example 5, p. 120.


13.6. The null space of a functional. Hyperplanes. Let $f$ be a linear
functional defined on a linear space $L$. Then the set $L_f$ of all elements $x \in L$ such
that

    $f(x) = 0$

is called the null space of $f$. It will be assumed that $f$ is nontrivial, i.e., that
$f(x) \ne 0$ for at least one (and hence infinitely many) $x \in L$, so that the set
$L - L_f$ is nonempty. Obviously $L_f$ is a subspace of $L$, since $x, y \in L_f$ implies

    $f(\alpha x + \beta y) = \alpha f(x) + \beta f(y) = 0.$


THEOREM 3. Let $x_0$ be any fixed element of $L - L_f$. Then every element
$x \in L$ has a unique representation of the form

    $x = \alpha x_0 + y,$

where $y \in L_f$.


Proof. Clearly $f(x_0) \ne 0$, and in particular $x_0 \ne 0$. There is no loss
of generality in assuming that $f(x_0) = 1$, since otherwise we need only
replace $x_0$ by $x_0 / f(x_0)$, noting that

    $f\!\left(\frac{x_0}{f(x_0)}\right) = \frac{f(x_0)}{f(x_0)} = 1.$

Given any $x \in L$, let

    $y = x - \alpha x_0,$

where

    $\alpha = f(x).$

Then $y \in L_f$, since

    $f(y) = f(x - \alpha x_0) = f(x) - \alpha f(x_0) = f(x) - \alpha = 0.$

Thus

    $x = \alpha x_0 + y \qquad (y \in L_f).$    (6)

Moreover, the representation (6) is unique. In fact, if there is another
such representation

    $x = \alpha' x_0 + y' \qquad (y' \in L_f),$    (6')

then, subtracting (6') from (6), we get

    $(\alpha - \alpha')x_0 = y' - y.$

If $\alpha = \alpha'$, then obviously $y' = y$. On the other hand, if $\alpha \ne \alpha'$, then

    $x_0 = \frac{y' - y}{\alpha - \alpha'} \in L_f,$

contrary to the choice of $x_0$.
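
The construction in the proof of Theorem 3 is easy to carry out numerically. The sketch below is an illustration only, assuming $L = R^n$ and a functional of the form $f(x) = \sum a_k x_k$ from Example 1 of Sec. 13.5; the helper names `make_functional` and `decompose` are hypothetical. It returns the coefficient $\alpha$ and the component $y$ lying in the null space.

```python
from fractions import Fraction


def make_functional(a):
    """Linear functional f(x) = sum of a_k * x_k on R^n, in exact rational arithmetic."""
    return lambda x: sum(Fraction(ak) * Fraction(xk) for ak, xk in zip(a, x))


def decompose(f, x0, x):
    """Return (alpha, y) with x = alpha * x0 + y and f(y) = 0, assuming f(x0) != 0."""
    fx0 = f(x0)
    if fx0 == 0:
        raise ValueError("x0 must lie outside the null space of f")
    # Dividing by f(x0) plays the role of replacing x0 by x0 / f(x0) in the proof.
    alpha = f(x) / fx0
    y = [Fraction(xk) - alpha * Fraction(x0k) for xk, x0k in zip(x, x0)]
    return alpha, y


f = make_functional([1, 2, 3])
alpha, y = decompose(f, [0, 1, 0], [4, 5, 6])
print(alpha, y, f(y))  # f(y) is zero, so y lies in the null space of f
```

Choosing a different $x_0$ outside the null space changes $\alpha$ and $y$, but for each fixed $x_0$ the pair is unique, as the theorem asserts.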


COROLLARY 1. Two elements $x_1$ and $x_2$ belong to the same class
generated by $L_f$ if and only if $f(x_1) = f(x_2)$.

Proof. It follows from

    $x_1 = f(x_1)x_0 + y_1,$
    $x_2 = f(x_2)x_0 + y_2$

that

    $x_1 - x_2 = (f(x_1) - f(x_2))x_0 + (y_1 - y_2).$

Hence $x_1 - x_2 \in L_f$ if and only if the coefficient of $x_0$ vanishes.


COROLLARY 2. $L_f$ has codimension 1.


Proof. Given any class $\xi$ generated by $L_f$, let $x$ be any element of $\xi$
and choose $f(x)x_0 = \alpha x_0$ as the "representative" of $\xi$. By Corollary 1,
this representative is unique, and there is obviously a nonzero class
since $x_0 \ne 0$ and $f(x) \ne 0$ for some $x \in L$. Moreover, given any two
distinct classes $\xi$ and $\eta$ with representatives $\alpha x_0$ and $\beta x_0$, respectively,
we have

    $\beta(\alpha x_0) - \alpha(\beta x_0) = 0$

and hence

    $\beta\xi - \alpha\eta = 0,$

where at least one of the numbers $\alpha$, $\beta$ is nonzero (why?). Therefore any
two distinct elements of $L/L_f$ are linearly dependent. It follows that
$L/L_f$ is one-dimensional, i.e., $L_f$ has codimension 1. ∎


COROLLARY 3. Two nontrivial linear functionals $f$ and $g$ with the same
null space are proportional.

Proof. Again let $x_0$ be such that $f(x_0) = 1$. Then $g(x_0) \ne 0$. In fact,

    $x = f(x)x_0 + y \qquad (y \in L_f),$

and hence

    $g(x) = f(x)g(x_0) + g(y) = f(x)g(x_0),$

since $L_f = L_g$. But then $g(x_0) = 0$ would imply that $g$ is trivial, contrary
to hypothesis. It follows that

    $g(x) = g(x_0)f(x),$

i.e., $g(x)$ is proportional to $f(x)$ with constant of proportionality $g(x_0)$.


Given a linear space $L$, let $L' \subset L$ be any subspace of codimension 1.
Then every class in $L$ generated by $L'$ is called a hyperplane "parallel to $L'$"
(in particular, $L'$ itself is a hyperplane containing 0, i.e., "going through the
origin"). In other words, a hyperplane $M'$ parallel to a subspace $L'$ is the
set obtained by subjecting $L'$ to the parallel displacement (or shift) determined
by the vector $x_0 \in L$, so that⁷

    $M' = L' + x_0 = \{x : x = x_0 + y,\ y \in L'\}.$

It is clear that $M' = L'$ if and only if $x_0 \in L'$. We can now give a simple
geometric interpretation of linear functionals:


THEOREM 4. Given a linear space $L$, let $f$ be a nontrivial linear functional
on $L$. Then the set $M_f = \{x : f(x) = 1\}$ is a hyperplane parallel to the null
space $L_f$ of the functional. Conversely, let $M' = L' + x_0$ $(x_0 \notin L')$ be any
hyperplane parallel to a subspace $L' \subset L$ of codimension 1 and not passing
through the origin. Then there exists a unique linear functional $f$ on $L$ such
that $M' = \{x : f(x) = 1\}$.


Proof. Given $f$, let $x_0$ be such that $f(x_0) = 1$ (such an $x_0$ can always
be found). Then, by Theorem 3, every vector $x \in M_f$ can be represented
in the form $x = x_0 + y$, where $y \in L_f$.

Conversely, given $M' = L' + x_0$ $(x_0 \notin L')$, it follows from Theorem 2
and its proof that every element $x \in L$ can be uniquely represented in the
form $x = \alpha x_0 + y$, where $y \in L'$. Setting $f(x) = \alpha$, we get the desired
linear functional. The uniqueness of $f$ follows from the fact that if
$g(x) = 1$ for $x \in M'$, then $g(y) = 0$ for $y \in L'$ (why?), so that

    $g(\alpha x_0 + y) = \alpha = f(\alpha x_0 + y).$ ∎


Remark. Thus we have established a one-to-one correspondence between
the set of all nontrivial linear functionals on $L$ and the set of all
hyperplanes in $L$ which do not pass through the origin.


7 The expression on the right is shorthand for the set of all $x$ such that $x = x_0 + y$,
$y \in L'$ (the colon is read "such that"). Similarly, $\{x : f(x) = 1\}$ is the set of all $x$ such that
$f(x) = 1$, and so on.
Problem 1. Prove that the set of all polynomials of degree at most $n - 1$ with
real (or complex) coefficients is a linear space, isomorphic to the $n$-dimensional
vector space $R^n$ (or $C^n$).


Problem 2. Verify that $R^n$ and $C^n$ are $n$-dimensional, as anticipated by the
terminology in Example 2, p. 119.


Problem 3. Verify that the spaces $C_{[a,b]}$, $l_2$, $c$, $c_0$, $m$ and $R^\infty$ are all
infinite-dimensional.


Problem 4. Given a linear space $L$, a set $\{x_\alpha\}$ of linearly independent
elements of $L$ is said to be a Hamel basis (in $L$) if the linear subspace generated
by $\{x_\alpha\}$ coincides with $L$. Prove that

a) Every linear space has a Hamel basis;

b) If $\{x_\alpha\}$ is a Hamel basis in $L$, then every vector $x \in L$ has a unique
   representation as a finite linear combination of vectors from the set
   $\{x_\alpha\}$;

c) Any two Hamel bases in a linear space $L$ have the same power
   (cardinal number), called the algebraic dimension of $L$;

d) Two linear spaces are isomorphic if and only if they have the same
   algebraic dimension.


Problem 5. Let L’ be a k-dimensional subspace of an n-dimensional linear 
space L. Prove that the factor space L/L’ has dimension n — k. 


Problem 6. Let $f, f_1, \ldots, f_n$ be linear functionals on a linear space $L$ such
that $f_1(x) = \cdots = f_n(x) = 0$ implies $f(x) = 0$. Prove that there exist
constants $a_1, \ldots, a_n$ such that

    $f(x) = \sum_{k=1}^{n} a_k f_k(x)$

for every $x \in L$.


14. Convex Sets and Functionals. The Hahn-Banach Theorem 


14.1. Convex sets and bodies. Many important topics in the theory of
linear spaces rely on the notion of convexity. This notion, stemming from
intuitive geometric ideas, can be formulated purely analytically. Given a
real linear space $L$, let $x$ and $y$ be any two points of $L$. Then by the (closed)
segment in $L$ joining $x$ and $y$ we mean the set of all points in $L$ of the form
$\alpha x + \beta y$ where $\alpha, \beta \ge 0$ and $\alpha + \beta = 1$. Such a segment minus its end
points $x$ and $y$ is called an open segment. By the interior of a set $M \subset L$,
denoted by $I(M)$, we mean the set of all points $x \in M$ with the following
property: Given any $y \in L$, there exists a number $\varepsilon = \varepsilon(y) > 0$ such that
$x + ty \in M$ if $|t| < \varepsilon$.

DEFINITION 1. A set $M \subset L$ is said to be convex if whenever it contains
two points $x$ and $y$, it also contains the segment joining $x$ and $y$.


DEFINITION 2. A convex set is called a convex body if its interior is
nonempty.


Example 1. The cube, ball, tetrahedron and half-space are all convex
bodies in three-dimensional Euclidean space $R^3$. On the other hand, the
line segment, plane and triangle are convex sets in $R^3$, but not convex bodies.

Example 2. As usual, let $C_{[a,b]}$ be the space of all functions continuous on
the interval $[a, b]$, and let $M$ be the subset of $C_{[a,b]}$ consisting of all functions
satisfying the extra condition

    $|f(t)| \le 1.$

Then $M$ is convex, since

    $|f(t)| \le 1, \qquad |g(t)| \le 1$

together with $\alpha, \beta \ge 0$, $\alpha + \beta = 1$ implies

    $|\alpha f(t) + \beta g(t)| \le \alpha + \beta = 1.$


Example 3. The closed unit sphere in $l_2$, i.e., the set of all points
$x = (x_1, x_2, \ldots, x_n, \ldots)$ such that

    $\sum_{n=1}^{\infty} x_n^2 \le 1,$

is a convex body. Its interior consists of all points $x = (x_1, x_2, \ldots, x_n, \ldots)$
satisfying the condition

    $\sum_{n=1}^{\infty} x_n^2 < 1.$


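Example 4. The Hilbert cube $\Pi$ (see Example 5, p. 98) is a convex set in
$l_2$, but not a convex body. In fact,

    $|x_n| \le \frac{1}{2^{n-1}}$

if $x \in \Pi$. Let

    $y_0 = \left(1, \frac{1}{2}, \ldots, \frac{1}{n}, \ldots\right)$

and suppose $x + ty_0 \in \Pi$, i.e.,

    $\left| x_n + \frac{t}{n} \right| \le \frac{1}{2^{n-1}}.$

Then

    $\left| \frac{t}{n} \right| \le \left| x_n + \frac{t}{n} \right| + |x_n| \le \frac{1}{2^{n-1}} + \frac{1}{2^{n-1}} = \frac{1}{2^{n-2}}$

for all $n = 1, 2, \ldots$, which implies $t = 0$. Therefore the interior of $\Pi$ is
empty.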
THEOREM 1. If $M$ is a convex set, then so is its interior $I(M)$.


Proof. Suppose $x, y \in I(M)$, and let $z = \alpha x + \beta y$, $\alpha, \beta \ge 0$,
$\alpha + \beta = 1$. Then, given any $a \in L$, there are numbers $\varepsilon_1 > 0$, $\varepsilon_2 > 0$ such
that the points $x + t_1 a$, $y + t_2 a$ belong to $M$ if $|t_1| < \varepsilon_1$, $|t_2| < \varepsilon_2$.
Therefore

    $\alpha(x + ta) + \beta(y + ta) = z + ta$

belongs to $M$ if $|t| < \varepsilon = \min\{\varepsilon_1, \varepsilon_2\}$, i.e., $z \in I(M)$. ∎

THEOREM 2. The intersection

    $M = \bigcap_\alpha M_\alpha$

of any number of convex sets $M_\alpha$ is itself a convex set.


Proof. Let $x$ and $y$ be any two points of $M$. Then $x$ and $y$ belong to
every $M_\alpha$, and hence so does the segment joining $x$ and $y$. But then the
segment joining $x$ and $y$ belongs to $M$. ∎


Given any subset A of a linear space L, there is a smallest convex set 
containing A, i.e., the intersection of all convex sets containing A (there 
is at least one convex set containing A, namely L itself). This minimal 
convex set containing A is called the convex hull of A. For example, the 
convex hull of three noncollinear points is the triangle with these points as 
vertices. 
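
For a finite set of points in the plane, the convex hull can be computed directly. The sketch below uses Andrew's monotone chain algorithm, which is a standard technique from computational geometry and not a method given in the text; the names `cross` and `convex_hull` are illustrative. It returns the vertices of the hull in counterclockwise order.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b; positive for a counterclockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of the convex hull of a finite set of points in the plane (tuples)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Three noncollinear points plus one point inside the triangle they span.
print(convex_hull([(0, 0), (4, 0), (0, 4), (1, 1)]))  # [(0, 0), (4, 0), (0, 4)]
```

The interior point $(1, 1)$ is discarded, so the hull is the triangle with the three noncollinear points as vertices, in agreement with the example above.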


14.2. Convex functionals. Next we introduce the important concept of a 
convex functional: 


DEFINITION 3. A functional $p$ defined on a real linear space $L$ is said to
be convex if

1) $p(x) \ge 0$ for all $x \in L$ (nonnegativity);
2) $p(\alpha x) = \alpha p(x)$ for all $x \in L$ and all $\alpha \ge 0$;
3) $p(x + y) \le p(x) + p(y)$ for all $x, y \in L$.


Remark. Here, unlike the case of linear functionals, we do not assume
that $p(x)$ is finite for all $x \in L$, i.e., we allow the case where $p(x) = +\infty$
for some $x \in L$.


Example 1. The length of a vector in Euclidean $n$-space $R^n$ is a convex
functional. The first and second conditions are immediate consequences of
the definition of length in $R^n$ (length is inherently nonnegative), while the
third condition means that the length of the sum of two vectors does not
exceed the sum of their lengths (the triangle inequality).
Example 2. Let $M$ be the space of bounded functions $x(s)$ defined on some
set $S$, and let $s_0$ be a fixed point of $S$. Then

    $p_{s_0}(x) = |x(s_0)|$

is a convex functional.

Example 3. Let $m$ be the space of bounded numerical sequences
$x = (x_1, x_2, \ldots, x_k, \ldots)$. Then the functional

    $p(x) = \sup_k |x_k|$

is convex.


14.3. The Minkowski functional. Next we consider the connection between
convex functionals and convex sets:


THEOREM 3. If $p$ is a convex functional on a linear space $L$ and $k$ is any
positive number, then the set

    $E = \{x : p(x) \le k\}$

is convex. If $p$ is finite, then $E$ is a convex body with interior

    $I(E) = \{x : p(x) < k\}$

(so that in particular $0 \in I(E)$).

Proof. If $x, y \in E$, $\alpha, \beta \ge 0$, $\alpha + \beta = 1$, then

    $p(\alpha x + \beta y) \le \alpha p(x) + \beta p(y) \le k,$

i.e., $E$ is a convex set. Now suppose $p$ is finite, and let $p(x) < k$, $t > 0$,
$y \in L$. Then

    $p(x \pm ty) \le p(x) + t\,p(\pm y).$

If $p(-y) = p(y) = 0$, then $x \pm ty \in E$ for all $t$. On the other hand, if at
least one of the numbers $p(y)$, $p(-y)$ is nonzero, then $x \pm ty \in E$ if

    $t < \frac{k - p(x)}{\max\{p(y), p(-y)\}}.$ ∎

Suppose we choose a definite value of $k$, say $k = 1$. Then every finite
convex functional $p$ uniquely determines a convex body $E$ in $L$, such that
$0 \in I(E)$. Conversely, suppose $E$ is a convex body whose interior contains
the point 0, and consider the functional

    $p_E(x) = \inf\left\{r : \frac{x}{r} \in E,\ r > 0\right\},$    (1)

called the Minkowski functional of the convex body $E$. Then we have


THEOREM 4. The Minkowski functional (1) is finite and convex.

Proof. Given any $x \in L$, the element $x/r$ belongs to $E$ if $r$ is
sufficiently large (why?), and hence $p_E(x)$ is nonnegative and finite. Clearly
$p_E(0) = 0$. If $\alpha > 0$, then

    $p_E(\alpha x) = \inf\left\{r > 0 : \frac{\alpha x}{r} \in E\right\} = \inf\left\{\alpha r' > 0 : \frac{x}{r'} \in E\right\}$
    $\qquad\quad\; = \alpha \inf\left\{r' > 0 : \frac{x}{r'} \in E\right\} = \alpha p_E(x).$    (2)

Next, given any $\varepsilon > 0$ and any $x_1, x_2 \in L$, choose numbers $r_i$ $(i = 1, 2)$
such that

    $p_E(x_i) < r_i < p_E(x_i) + \varepsilon.$

Then $x_i / r_i \in E$. If $r = r_1 + r_2$, then

    $\frac{x_1 + x_2}{r} = \frac{r_1}{r}\,\frac{x_1}{r_1} + \frac{r_2}{r}\,\frac{x_2}{r_2}$

belongs to the segment with end points $x_1/r_1$ and $x_2/r_2$. Since $E$ is convex,
this segment and hence the point $(x_1 + x_2)/r$ belongs to $E$. It follows that

    $p_E(x_1 + x_2) \le r = r_1 + r_2 < p_E(x_1) + p_E(x_2) + 2\varepsilon$

or

    $p_E(x_1 + x_2) \le p_E(x_1) + p_E(x_2),$    (3)

since $\varepsilon$ is arbitrary. Together (2) and (3) imply that $p_E(x)$ is convex. ∎
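
The definition (1) can be evaluated numerically when membership in $E$ can be tested. The following is a minimal sketch, assuming $L = R^n$ and a hypothetical membership test `in_body` for a convex body $E$ whose interior contains 0. Since $E$ is convex and contains 0, the set of admissible $r$ is a half-line, so bisection locates its greatest lower bound.

```python
def minkowski_functional(in_body, x, tol=1e-9):
    """Approximate p_E(x) = inf{r > 0 : x/r in E} by bisection.

    in_body(point) -> bool is an assumed membership test for a convex
    body E in R^n whose interior contains the origin.
    """
    if all(c == 0 for c in x):
        return 0.0
    # Find some r = hi with x/hi in E (possible because 0 is an interior point).
    hi = 1.0
    while not in_body([c / hi for c in x]):
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if in_body([c / mid for c in x]):
            hi = mid  # x/mid is in E, so the infimum is at most mid
        else:
            lo = mid  # x/mid is outside E, so the infimum is at least mid
    return hi


def square(pt):
    """The square |x_1| <= 1, |x_2| <= 1."""
    return all(abs(c) <= 1.0 for c in pt)


def disk(pt):
    """The closed unit disk."""
    return sum(c * c for c in pt) <= 1.0


print(minkowski_functional(square, [3.0, -2.0]))  # close to max(|x_1|, |x_2|) = 3
print(minkowski_functional(disk, [3.0, 4.0]))     # close to the Euclidean length 5
```

For the square the functional agrees with $\max(|x_1|, |x_2|)$, and for the disk with the Euclidean length, which illustrates how different convex bodies give rise to different convex functionals.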


14.4. The Hahn-Banach theorem. Given a real linear space $L$ and any
subspace $L_0 \subset L$, let $f_0$ be a linear functional defined on $L_0$. Then a linear
functional $f$ defined on the whole space $L$ is said to be an extension of the
functional $f_0$ if

    $f(x) = f_0(x)$ for all $x \in L_0$.


A problem frequently encountered in analysis is that of extending an arbitrary 
linear functional, originally defined on some subspace, onto a larger space. 
A central role in problems of this kind is played by 


THEOREM 5 (Hahn-Banach). Let $p$ be a finite convex functional defined
on a real linear space $L$, and let $L_0$ be a subspace of $L$. Suppose $f_0$ is a
linear functional on $L_0$ satisfying the condition

    $f_0(x) \le p(x)$    (4)

on $L_0$. Then $f_0$ can be extended to a linear functional on $L$ satisfying (4)
on the whole space $L$. More exactly, there is a linear functional $f$ defined
on $L$ and equal to $f_0$ at every point of $L_0$, such that $f(x) \le p(x)$ on $L$.
Proof. Suppose $L_0 \ne L$, since otherwise the theorem is trivial. We
begin by showing that $f_0$ can be extended onto a larger space $\bar{L}$ without
violating the condition (4). Let $z$ be any element of $L - L_0$, and let $\bar{L}$
be the subspace generated by $L_0$ and the element $z$, i.e., the set of all linear
combinations

    $x + tz \qquad (x \in L_0).$

If $\bar{f}$ is to be an extension of $f_0$ onto $\bar{L}$, we must have

    $\bar{f}(x + tz) = f_0(x) + t\bar{f}(z)$

or

    $\bar{f}(x + tz) = f_0(x) + tc$    (5)

after setting $\bar{f}(z) = c$. We now choose $c$ such that the "majorization"
condition $\bar{f}(x + tz) \le p(x + tz)$ is satisfied, i.e., such that

    $f_0(x) + tc \le p(x + tz).$

We can write this condition as

    $f_0\!\left(\frac{x}{t}\right) + c \le p\!\left(\frac{x}{t} + z\right)$

or

    $c \le p\!\left(\frac{x}{t} + z\right) - f_0\!\left(\frac{x}{t}\right)$    (6)

if $t > 0$, and as

    $f_0\!\left(\frac{x}{t}\right) + c \ge -p\!\left(-\frac{x}{t} - z\right)$

or

    $c \ge -p\!\left(-\frac{x}{t} - z\right) - f_0\!\left(\frac{x}{t}\right)$    (7)

if $t < 0$. Hence we want to show that there is always a value of $c$ satisfying
(6) and (7). Let $y'$ and $y''$ be arbitrary elements of $L_0$. Then it follows
from the inequality

    $f_0(y'') - f_0(y') \le p(y'' - y') = p((y'' + z) - (y' + z)) \le p(y'' + z) + p(-y' - z)$

that

    $-f_0(y'') + p(y'' + z) \ge -f_0(y') - p(-y' - z).$    (8)

Let

    $c' = \sup_{y'}\,[-f_0(y') - p(-y' - z)],$
    $c'' = \inf_{y''}\,[-f_0(y'') + p(y'' + z)].$

Then

    $c'' \ge c',$

by (8) and the fact that $y'$ and $y''$ are arbitrary. Hence, choosing $c$ such
that

    $c'' \ge c \ge c',$

we find that the functional $\bar{f}$ defined on $\bar{L}$ by the formula (5) satisfies the
condition $\bar{f}(x) \le p(x)$. Thus we have succeeded in showing that if $f_0$ is
defined on a subspace $L_0 \subset L$ and satisfies (4) on $L_0$, then $f_0$ can be
extended onto a larger subspace $\bar{L}$ with the condition (4) being preserved.

To complete the proof, suppose first that $L$ is generated by a countable
set of elements $x_1, x_2, \ldots, x_n, \ldots$ in $L$. Then we construct a functional
on $L$ by induction, i.e., by constructing a sequence of subspaces

    $L^{(1)} = \{L_0, x_1\}, \quad L^{(2)} = \{L^{(1)}, x_2\}, \quad \ldots,$

each contained in the next. Here $\{L^{(k)}, x_{k+1}\}$ denotes the minimal linear
subspace of $L$ containing $L^{(k)}$ and $x_{k+1}$. This process extends the
functional onto the whole space $L$, since every element $x \in L$ belongs to
some subspace $L^{(k)}$.

More generally, i.e., in the case where there is no countable set
generating $L$, the theorem is proved by applying Zorn's lemma (see
p. 28). The set $\mathcal{F}$ of all possible extensions of the functional $f_0$ satisfying
the majorization condition (4) is partially ordered, and each linearly
ordered subset $\mathcal{F}_0 \subset \mathcal{F}$ has an upper bound. This upper bound is the
functional which is defined on the union of the domains of all functionals
$f \in \mathcal{F}_0$, and coincides with every such functional $f$ on the domain of $f$.
Hence, by Zorn's lemma, $\mathcal{F}$ has a maximal element $f$. Clearly $f$ must be
the desired functional extending $f_0$ onto $L$ and satisfying the condition
$f(x) \le p(x)$, since otherwise we could extend $f$ in turn, by the method
described above, from the proper subspace on which it is defined onto a
larger subspace, thereby contradicting the maximality of $f$. ∎
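
The one-step extension in the proof can be illustrated on a small example. The sketch below is an assumption-laden illustration, not part of the original argument: it takes $L = R^2$, $L_0$ the first coordinate axis, $f_0(s, 0) = s$, $p(x) = |x_1| + |x_2|$ and $z = (0, 1)$, and it approximates the bounds $c'$ and $c''$ by taking the maximum and minimum over a finite sample of $L_0$ rather than the true supremum and infimum. The function name `extension_bounds` is hypothetical.

```python
def extension_bounds(f0, p, z, subspace_samples):
    """Estimate c' = sup[-f0(y) - p(-y - z)] and c'' = inf[-f0(y) + p(y + z)].

    The sup and inf are taken over the finite list subspace_samples of
    points of L0, so the result is only an approximation in general.
    """
    def add(u, v):
        return tuple(a + b for a, b in zip(u, v))

    def neg(u):
        return tuple(-a for a in u)

    c_lower = max(-f0(y) - p(neg(add(y, z))) for y in subspace_samples)
    c_upper = min(-f0(y) + p(add(y, z)) for y in subspace_samples)
    return c_lower, c_upper


f0 = lambda y: y[0]                    # functional on L0 = {(s, 0)}
p = lambda x: abs(x[0]) + abs(x[1])    # finite convex functional on R^2
samples = [(s, 0) for s in range(-5, 6)]

c_lower, c_upper = extension_bounds(f0, p, (0, 1), samples)
print(c_lower, c_upper)  # -1 1


def extended(x, c):
    """Extension (5): the value of the extended functional at x = (x_1, 0) + x_2 * z."""
    return x[0] + c * x[1]
```

In this example any $c$ with $-1 \le c \le 1$ gives an extension $f(x) = x_1 + c x_2$ majorized by $p$, so the extension is far from unique.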


Next we turn to the case of complex linear spaces: 


DEFINITION 3'. A functional $p$ defined on a complex linear space $L$ is
said to be convex if

1) $p(x) \ge 0$ for all $x \in L$ (nonnegativity);

2) $p(\alpha x) = |\alpha|\,p(x)$ for all $x \in L$ and all complex $\alpha$;

3) $p(x + y) \le p(x) + p(y)$ for all $x, y \in L$.


The corresponding complex version of the Hahn-Banach theorem is
given by

THEOREM 5'. Let $p$ be a finite convex functional defined on a complex
linear space $L$, and let $L_0$ be a subspace of $L$. Suppose $f_0$ is a linear
functional on $L_0$ satisfying the condition

    $|f_0(x)| \le p(x)$    (4')

on $L_0$. Then $f_0$ can be extended to a linear functional on $L$ satisfying (4')
on the whole space $L$.


Proof. Let $L_R$ and $L_{0R}$ denote the spaces $L$ and $L_0$, regarded as real
linear spaces. Clearly $p$ is a finite convex functional on $L_R$, while

$$f_{0R}(x) = \operatorname{Re} f_0(x)$$

is a real linear functional on $L_{0R}$ satisfying the condition

$$|f_{0R}(x)| \le p(x)$$

and hence (a fortiori) the condition

$$f_{0R}(x) \le p(x).$$

By Theorem 5, there exists a real linear functional $f_R$ defined on all of $L_R$,
satisfying the conditions

$$f_R(x) \le p(x) \quad \text{if } x \in L_R\ (= L),$$
$$f_R(x) = f_{0R}(x) \quad \text{if } x \in L_{0R}\ (= L_0).$$

Clearly

$$-f_R(x) = f_R(-x) \le p(-x) = p(x),$$

and hence

$$|f_R(x)| \le p(x) \quad \text{if } x \in L_R\ (= L). \tag{9}$$

We now define the functional

$$f(x) = f_R(x) - i f_R(ix)$$

on $L$, using the fact that $L$ is a complex linear space in which multiplication
by complex numbers is defined. It is easily verified that $f$ is a complex
linear functional on $L$ such that

$$f(x) = f_0(x) \quad \text{if } x \in L_0,$$
$$\operatorname{Re} f(x) = f_R(x) \quad \text{if } x \in L.$$

Finally, to show that $|f(x)| \le p(x)$ for all $x \in L$, suppose to the contrary
that $|f(x_0)| > p(x_0)$ for some $x_0 \in L$. Writing $f(x_0) = \rho e^{i\varphi}$ where $\rho > 0$,
we set $y_0 = e^{-i\varphi} x_0$. Then

$$f_R(y_0) = \operatorname{Re} f(y_0) = \operatorname{Re}\,[e^{-i\varphi} f(x_0)] = \rho > p(x_0) = p(y_0),$$

which contradicts (9). ∎
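
The formula $f(x) = f_R(x) - i f_R(ix)$ used in the proof can be checked numerically. The following is a minimal sketch, assuming a hypothetical complex linear functional on $C^3$ given by a coefficient tuple `A` (an illustrative choice, not taken from the text); it only illustrates that a complex linear functional is recovered from its real part.

```python
# Sketch: rebuild a complex linear functional from its real part.
# The coefficients A are a hypothetical example, not from the text.
A = (2 + 1j, -1j, 0.5)


def f(x):
    """Complex linear functional f(x) = sum_k A_k x_k on C^3."""
    return sum(ak * xk for ak, xk in zip(A, x))


def f_real(x):
    """Real part f_R(x) = Re f(x); this functional is only real linear."""
    return f(x).real


def rebuilt(x):
    """The formula from the proof: f(x) = f_R(x) - i f_R(ix)."""
    ix = [1j * xk for xk in x]
    return f_real(x) - 1j * f_real(ix)


x = [1 + 2j, -3j, 4.0]
difference = abs(f(x) - rebuilt(x))  # zero up to rounding
```

The check works because $f_R(ix) = \operatorname{Re}\,[i f(x)] = -\operatorname{Im} f(x)$, so the right-hand side is $\operatorname{Re} f(x) + i \operatorname{Im} f(x)$.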


14.5. Separation of convex sets in a linear space. Given a real linear space
$L$, let $M$ and $N$ be two subsets of $L$. Then a linear functional $f$ defined on
$L$ is said to separate $M$ and $N$ if there exists a number $C$ such that

$$f(x) \ge C \quad \text{if } x \in M,$$
$$f(x) \le C \quad \text{if } x \in N.$$

It follows at once from this definition that 


1) A linear functional $f$ separates two sets $M$ and $N$ if and only if it
separates $M - N = \{z : z = x - y,\ x \in M,\ y \in N\}$ and $\{0\}$, i.e., the set
consisting of all differences $x - y$ where $x \in M$, $y \in N$ and the set
whose only element is 0 (note that the minus sign in $M - N$ does not
have the usual meaning of a set difference);

2) A linear functional $f$ separates two sets $M$ and $N$ if and only if it
separates the sets $M - x_0 = \{z : z = x - x_0,\ x \in M\}$ and $N - x_0 =
\{z : z = y - x_0,\ y \in N\}$ for every $x_0 \in L$.


The following theorem on the separation of convex sets in a linear space 
has numerous applications and is an easy consequence of the Hahn-Banach 
theorem: 


THEOREM 6. Let $M$ and $N$ be two disjoint convex sets in a real linear
space $L$, where at least one of the sets, say $M$, has a nonempty interior
(i.e., is a convex body). Then there exists a nontrivial linear functional $f$ on
$L$ separating $M$ and $N$.


Proof. There is no loss of generality in assuming that the point 0
belongs to the interior of $M$, since otherwise we need only consider the
sets $M - x_0 = \{z : z = x - x_0,\ x \in M\}$ and $N - x_0 = \{z : z = y - x_0,\
y \in N\}$, where $x_0$ is some point of the interior of $M$. Let $y_0$ be a point of
$N$. Then the point $-y_0$ belongs to the interior of the set $M - N =
\{z : z = x - y,\ x \in M,\ y \in N\}$, and 0 belongs to the interior of the set
$M - N + y_0 = \{z : z = x - y + y_0,\ x \in M,\ y \in N\}$. Since $M$ and $N$ are
disjoint, we have $0 \notin M - N$, $y_0 \notin M - N + y_0$. Let $p$ be the Minkowski
functional for the set $M - N + y_0$. Then $p(y_0) \ge 1$ since $y_0 \notin M - N + y_0$.
Consider the linear functional

$$f_0(\alpha y_0) = \alpha\, p(y_0)$$

defined on the one-dimensional subspace of $L$ consisting of all elements
of the form $\alpha y_0$. Clearly $f_0$ satisfies the condition

$$f_0(\alpha y_0) \le p(\alpha y_0),$$

since

$$p(\alpha y_0) = \alpha\, p(y_0) \quad \text{if } \alpha \ge 0,$$

while

$$f_0(\alpha y_0) = \alpha f_0(y_0) < 0 \le p(\alpha y_0) \quad \text{if } \alpha < 0.$$

Hence, by the Hahn-Banach theorem, the functional $f_0$ can be extended
to a linear functional $f$ defined on the whole space $L$ and satisfying the
condition $f(y) \le p(y)$ on $L$. It follows that $f(y) \le 1$ if $y \in M - N + y_0$,
while at the same time $f(y_0) \ge 1$, i.e., $f$ separates the sets $M - N + y_0$
and $\{y_0\}$. Therefore $f$ separates the sets $M - N$ and $\{0\}$. But then $f$
separates the sets $M$ and $N$. ∎


Problem 1. Let $M$ be the set of all points $x = (x_1, x_2, \ldots, x_n, \ldots)$ in $l_2$
satisfying the condition

$$\sum_{n=1}^{\infty} n^2 x_n^2 \le 1.$$

Prove that $M$ is a convex set, but not a convex body.


Problem 2. Give an example of two convex bodies whose intersection is 
not a convex body. 


Problem 3. We say that $n + 1$ points $x_1, x_2, \ldots, x_{n+1}$ in a linear space $L$
are "in general position" if they do not belong to any $(n - 1)$-dimensional
subspace of $L$. The convex hull of a set of $n + 1$ points $x_1, x_2, \ldots, x_{n+1}$ in
general position is called an n-dimensional simplex, and the points $x_1, x_2, \ldots,
x_{n+1}$ themselves are called the vertices of the simplex. Describe the zero-dimensional,
one-dimensional, two-dimensional and three-dimensional
simplexes in Euclidean three-space $R^3$. Prove that the simplex with vertices
$x_1, x_2, \ldots, x_{n+1}$ is the set of all points in $L$ which can be represented in the
form

$$x = \sum_{k=1}^{n+1} \alpha_k x_k,$$

where

$$\alpha_k \ge 0, \qquad \sum_{k=1}^{n+1} \alpha_k = 1.$$


Problem 4. Show that if the points $x_1, x_2, \ldots, x_{n+1}$ are in general position,
then so are any $k + 1$ ($k < n$) of them.

Comment. Hence the $k + 1$ points generate a k-dimensional simplex,
called a k-dimensional face of the n-dimensional simplex with vertices $x_1,
x_2, \ldots, x_{n+1}$.

Problem 5. Describe all zero-dimensional, one-dimensional and two-dimensional
faces of the tetrahedron in $R^3$ with vertices $e_1, e_2, e_3, e_4$.


Problem 6. Show that in the Hahn-Banach theorem we can drop the 
condition that the functional p be finite. 


15. Normed Linear Spaces 


15.1. Definitions and examples. Chapters 2 and 3 deal with topological
(in particular, metric) spaces, i.e., spaces equipped with the notion of
closeness of elements, while Secs. 13 and 14 deal with linear spaces, i.e.,
spaces equipped with the operations of addition of elements and multiplication
of elements by numbers. We now combine these two ideas, arriving at
the notion of a topological linear space, equipped with a topology as well
as with the algebraic operations characterizing a linear space. In this section
and the next, we will study topological linear spaces of a particularly
important type, namely normed linear spaces and Euclidean spaces. Topological
linear spaces in general will be considered in Sec. 17.


DEFINITION 1. A functional $p$ defined on a linear space $L$ is said to be
a norm (in $L$) if it has the following properties:

a) $p$ is finite and convex;

b) $p(x) = 0$ only if $x = 0$;

c) $p(\alpha x) = |\alpha|\, p(x)$ for all $x \in L$ and all $\alpha$.


Recalling the definition of a convex functional, we see that a norm in
$L$ is a finite functional on $L$ such that

1) $p(x) \ge 0$ for all $x \in L$, where $p(x) = 0$ if and only if $x = 0$;
2) $p(\alpha x) = |\alpha|\, p(x)$ for all $x \in L$ and all $\alpha$;
3) $p(x + y) \le p(x) + p(y)$ for all $x, y \in L$.


DEFINITION 2. A linear space $L$, equipped with a norm $p(x) = \|x\|$, is
called a normed linear space.

The notation $\|x\|$ will henceforth be preferred for the norm of the element
$x \in L$. In terms of this notation, properties 1)-3) take the form:

1') $\|x\| \ge 0$ for all $x \in L$, where $\|x\| = 0$ if and only if $x = 0$;
2') $\|\alpha x\| = |\alpha|\, \|x\|$ for all $x \in L$ and all $\alpha$;
3') Triangle inequality: $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in L$.


Every normed linear space $L$ becomes a metric space if we set

$$\rho(x, y) = \|x - y\| \tag{1}$$

for arbitrary $x, y \in L$. The fact that (1) is a metric follows at once from
properties 1')-3'). Thus everything said about metric spaces in Chap. 2
carries over to the case of normed linear spaces.

Many of the spaces considered in Chap. 2 as examples of metric spaces 
(or in Sec. 13 as examples of linear spaces) can be made into normed linear 
spaces in a natural way, as shown by the following examples (in each case, 
verify that the norm has all the required properties): 





8 One of the pioneer workers in this field was Stefan Banach (1892-1945), author of
the classic Théorie des Opérations Linéaires, reprinted by Chelsea Publishing Co., New 
York (1955). 
Example 1. The real line $R^1$ becomes a normed linear space if we set
$\|x\| = |x|$ for every number $x \in R^1$.


Example 2. To make real n-space $R^n$ into a normed linear space, we set

$$\|x\| = \sqrt{\sum_{k=1}^{n} x_k^2}$$

for every element $x = (x_1, x_2, \ldots, x_n)$ in $R^n$. The formula

$$\rho(x, y) = \|x - y\| = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}$$

then defines the same metric in $R^n$ as already considered in Example 3, p. 38.


Example 3. We can also equip real n-space with the norm

$$\|x\|_1 = \sum_{k=1}^{n} |x_k| \tag{2}$$

or the norm

$$\|x\|_0 = \max_{1 \le k \le n} |x_k|. \tag{3}$$

The corresponding metrics lead to the spaces $R_1^n$ and $R_0^n$ considered in Examples
4 and 5, p. 39.
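
The three norms on real n-space introduced in Examples 2 and 3 are easy to compare on concrete vectors. The following is a minimal sketch; the function names and the sample vectors are illustrative assumptions, not notation from the text.

```python
import math


def norm_2(x):
    """Euclidean norm of Example 2: square root of the sum of squares."""
    return math.sqrt(sum(xk * xk for xk in x))


def norm_1(x):
    """Norm (2): sum of the absolute values of the coordinates."""
    return sum(abs(xk) for xk in x)


def norm_max(x):
    """Norm (3): largest absolute value of a coordinate."""
    return max(abs(xk) for xk in x)


def add(x, y):
    """Coordinatewise sum of two vectors of equal length."""
    return [xk + yk for xk, yk in zip(x, y)]


x = [3.0, -4.0, 0.0]
y = [1.0, 2.0, -2.0]

# Each norm satisfies the triangle inequality on this pair of vectors.
triangle_ok = all(
    norm(add(x, y)) <= norm(x) + norm(y)
    for norm in (norm_2, norm_1, norm_max)
)
```

For the sample vector `x`, the three norms take the values 5, 7 and 4 respectively, which shows that the norms differ although they are defined on the same linear space.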


Example 4. The formula

$$\|x\| = \sqrt{\sum_{k=1}^{n} |x_k|^2}$$

introduces a norm in complex n-space $C^n$. Other possible norms in $C^n$ are
given by (2) and (3).


Example 5. The space $C_{[a,b]}$ of all functions continuous on the interval
$[a, b]$ can be equipped with the norm

$$\|f\| = \max_{a \le t \le b} |f(t)|.$$

The metric space corresponding to this norm has already been considered in
Example 6, p. 39.


Example 6. Let $m$ be the space of all bounded numerical sequences

$$x = (x_1, x_2, \ldots, x_k, \ldots),$$

and let

$$\|x\| = \sup_k |x_k|. \tag{4}$$

Then (4) obviously has all the properties of a norm. The metric "induced"
by this norm is the same as that considered in Example 9, p. 41.
Example 7. A complete normed linear space, relative to the metric (1), is
called a Banach space. It is easy to see that the spaces in Examples 1-6 are
all Banach spaces (the details are left as an exercise).


15.2. Subspaces of a normed linear space. In Sec. 13.3 we defined a
subspace of a linear space $L$ (unequipped with any topology) as a nonempty
set $L_0$ with the property that if $x, y \in L_0$, then $\alpha x + \beta y \in L_0$ for arbitrary $\alpha$
and $\beta$. The subspaces of greatest interest in a normed linear space are the
closed subspaces, i.e., those containing all their limit points. In the case of an
infinite-dimensional space, it is easy to give examples of subspaces that are
not closed:⁹


Example 1. In the space of all bounded sequences, the sequences with
only finitely many nonzero terms form a subspace, but not a closed subspace,
since, for example, the closure of the subspace contains the sequence

$$\left(1, \tfrac{1}{2}, \ldots, \tfrac{1}{n}, \ldots\right).$$


Example 2. The set $P_{[a,b]}$ of all polynomials defined on the interval $[a, b]$
is a subspace of $C_{[a,b]}$, but obviously not a closed subspace. On the other
hand, the closure of $P_{[a,b]}$ coincides with $C_{[a,b]}$, since every function continuous
on $[a, b]$ is the limit of a uniformly convergent sequence of polynomials,
by Weierstrass' approximation theorem.¹⁰


In what follows, we will be concerned as a rule with closed subspaces.
Hence it is natural to modify somewhat the terminology adopted in Sec. 13.3,
i.e., by a subspace of a normed linear space we will always mean a
closed subspace. In particular, by the subspace generated by a set of elements
$\{x_\alpha\}$ we will always mean the smallest closed subspace containing $\{x_\alpha\}$. This
subspace will also be called the linear closure of $\{x_\alpha\}$. The term linear manifold
will be reserved for a set of elements $L_0$ (not necessarily closed) such that
$x, y \in L_0$ implies $\alpha x + \beta y \in L_0$ for arbitrary numbers $\alpha$ and $\beta$. A set of
elements $\{x_\alpha\}$ in a normed linear space $L$ is said to be complete (in $L$) if the
linear closure of $\{x_\alpha\}$ coincides with $L$.




Example 3. By Weierstrass' approximation theorem, the set of functions
$1, t, t^2, \ldots, t^n, \ldots$ is complete in $C_{[a,b]}$.


9 This contingency cannot arise in a finite-dimensional subspace (see Problem 5a).
10 See e.g., G. P. Tolstov, Fourier Series (translated by R. A. Silverman), Prentice-Hall,
Inc., Englewood Cliffs, N.J. (1962), p. 120.
Problem 1. A subset $M$ of a normed linear space $R$ is said to be bounded
if there is a constant $C$ such that $\|x\| \le C$ for all $x \in M$. Reconcile this with
Problem 5, p. 65.


Problem 2. Given a Banach space $R$, let $\{B_n\}$ be a nested sequence of
closed spheres in $R$ (so that $B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots$). Prove that $\bigcap_n B_n$
is nonempty (it is not assumed that the radius of $B_n$ approaches 0 as $n \to \infty$).
Give an example of a nested sequence $\{E_n\}$ of nonempty closed bounded
convex sets in a Banach space $R$ such that $\bigcap_n E_n$ is empty (cf. Problem 6,
p. 66).


Problem 3. Prove that the algebraic dimension (defined in Problem 4c,
p. 128) of an infinite-dimensional Banach space is uncountable.


Problem 4. Let $R$ be a Banach space, and let $M$ be a closed subspace of $R$.
Define a norm in the factor space $P = R/M$ by setting

$$\|\xi\| = \inf_{x \in \xi} \|x\|$$

for every element (residue class) $\xi \in P$. Prove that
a) $\|\xi\|$ is actually a norm in $P$;
b) The space $P$, equipped with this norm, is a Banach space.


Problem 5. Let $R$ be a normed linear space. Prove that
a) Every finite-dimensional linear subspace of $R$ is closed;
b) If $M$ is a closed subspace of $R$ and $N$ a finite-dimensional subspace
of $R$, then the set

$$M + N = \{z : z = x + y,\ x \in M,\ y \in N\} \tag{5}$$

is a closed subspace of $R$;
c) If $Q$ is an open convex set in $R$ and $x_0 \notin Q$, then there exists a closed
hyperplane which passes through the point $x_0$ and does not intersect $Q$.


Problem 6. Let $x = (x_1, x_2, \ldots, x_k, \ldots)$ be an arbitrary element of $l_2$.
Prove that $l_2$ is a normed linear space when equipped with the norm

$$\|x\| = \sqrt{\sum_{k=1}^{\infty} x_k^2}.$$

Give an example of two closed linear subspaces $M$ and $N$ of $l_2$ whose "linear
sum" $M + N$ is not closed.





Problem 7. Two norms $\|\cdot\|_1$, $\|\cdot\|_2$ in a linear space $R$ are said to be
equivalent if there exist constants $a, b > 0$ such that

$$a\, \|x\|_1 \le \|x\|_2 \le b\, \|x\|_1$$

for all $x \in R$. Prove that if $R$ is finite-dimensional, then any two norms in
$R$ are equivalent.
16. Euclidean Spaces


16.1. Scalar products. Orthogonality and bases. We begin with two key
definitions:

DEFINITION 1. By a scalar product in a real linear space $R$ is meant a
real function defined for every pair of elements $x, y \in R$ and denoted by
$(x, y)$, with the following properties:

1) $(x, x) \ge 0$ where $(x, x) = 0$ if and only if $x = 0$;

2) $(x, y) = (y, x)$;

3) $(\lambda x, y) = \lambda (x, y)$;

4) $(x, y + z) = (x, y) + (x, z)$

(valid for all $x, y, z \in R$ and all real $\lambda$).

DEFINITION 2. A linear space $R$ equipped with a scalar product is called
a Euclidean space.

LEMMA. Any two elements $x, y$ of a Euclidean space $R$ satisfy the
Schwarz inequality

$$|(x, y)| \le \|x\|\, \|y\|, \tag{1}$$

where

$$\|x\| = \sqrt{(x, x)}, \qquad \|y\| = \sqrt{(y, y)}.$$

Proof. The quadratic polynomial

$$\varphi(\lambda) = (\lambda x + y, \lambda x + y) = \lambda^2 (x, x) + 2\lambda (x, y) + (y, y)
= \|x\|^2 \lambda^2 + 2 (x, y) \lambda + \|y\|^2$$

is obviously nonnegative. Therefore

$$(x, y)^2 - \|x\|^2 \|y\|^2 \le 0, \tag{2}$$

since otherwise $\varphi(\lambda)$ would become negative for some $\lambda$ (why?). But (2)
is equivalent to (1). ∎

We now use the scalar product in a Euclidean space $R$ to introduce a
norm in $R$:

THEOREM 1. A Euclidean space $R$ becomes a normed linear space when
equipped with the norm

$$\|x\| = \sqrt{(x, x)} \qquad (x \in R).$$

Proof. Properties 1') and 2') on p. 138 are immediate consequences
of the definition of a scalar product. To prove property 3'), i.e., the
triangle inequality, we note that

$$\|x + y\|^2 = (x + y, x + y) = (x, x) + 2 (x, y) + (y, y)$$
$$\le (x, x) + 2\, |(x, y)| + (y, y)$$
$$\le \|x\|^2 + 2\, \|x\|\, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

because of the Schwarz inequality (1), and hence

$$\|x + y\| \le \|x\| + \|y\|. \ ∎$$


The scalar product in R can be used to define the angle between two 
vectors as well as the length (i.e., norm) of a vector: 


DEFINITION 3. Given any two nonzero vectors $x$ and $y$ in a Euclidean space $R$,
the quantity $\theta$ defined by the formula

$$\cos \theta = \frac{(x, y)}{\|x\|\, \|y\|} \qquad (0 \le \theta \le \pi) \tag{3}$$

is called the angle between $x$ and $y$.

Remark. It follows from Schwarz's inequality (1) that the right-hand
side of (3) cannot exceed 1 in absolute value. Therefore, given any nonzero $x$ and $y$,
(3) actually determines a unique angle in the interval $[0, \pi]$.
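
As a concrete illustration of the Schwarz inequality (1) and of formula (3), the following minimal sketch works in Euclidean n-space with the usual coordinatewise scalar product. The function names and the sample vectors are illustrative assumptions, not notation from the text.

```python
import math


def dot(x, y):
    """Scalar product (x, y) as the sum of coordinatewise products."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm induced by the scalar product."""
    return math.sqrt(dot(x, x))


def angle(x, y):
    """Angle in [0, pi] between nonzero vectors x and y, from formula (3)."""
    c = dot(x, y) / (norm(x) * norm(y))
    # Guard against rounding pushing the quotient slightly outside [-1, 1].
    c = max(-1.0, min(1.0, c))
    return math.acos(c)


x = [1.0, 0.0]
y = [1.0, 1.0]

schwarz_holds = abs(dot(x, y)) <= norm(x) * norm(y)
theta = angle(x, y)                    # pi/4 up to rounding
right = angle([1.0, 0.0], [0.0, 2.0])  # pi/2 for orthogonal vectors
```

The clamp in `angle` is a numerical safeguard only; mathematically the quotient already lies in $[-1, 1]$ by (1).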


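Suppose $(x, y) = 0$, so that (3) implies $\theta = \pi/2$. Then the vectors $x$ and $y$
are said to be orthogonal. A set of nonzero vectors $\{x_\alpha\}$ in $R$ is said to be
an orthogonal system if

$$(x_\alpha, x_\beta) = 0 \quad \text{for } \alpha \ne \beta$$

and an orthonormal system if

$$(x_\alpha, x_\beta) = \begin{cases} 0 & \text{for } \alpha \ne \beta, \\ 1 & \text{for } \alpha = \beta. \end{cases}$$

If $\{x_\alpha\}$ is an orthogonal system, then clearly

$$\left\{ \frac{x_\alpha}{\|x_\alpha\|} \right\}$$

is an orthonormal system.

THEOREM 2. The vectors in an orthogonal system $\{x_\alpha\}$ are linearly
independent.

Proof. Suppose

$$c_1 x_{\alpha_1} + c_2 x_{\alpha_2} + \cdots + c_n x_{\alpha_n} = 0.$$

Then, taking the scalar product with $x_{\alpha_i}$, we get

$$(x_{\alpha_i}, c_1 x_{\alpha_1} + c_2 x_{\alpha_2} + \cdots + c_n x_{\alpha_n}) = c_i (x_{\alpha_i}, x_{\alpha_i}) = 0,$$

by the orthogonality of $\{x_\alpha\}$. But $(x_{\alpha_i}, x_{\alpha_i}) \ne 0$, and hence

$$c_i = 0 \qquad (i = 1, 2, \ldots, n). \ ∎$$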


An orthogonal system $\{x_\alpha\}$ is called an orthogonal basis if it is complete,
i.e., if the smallest closed subspace containing $\{x_\alpha\}$ is the whole space $R$.
Similarly, a complete orthonormal system is called an orthonormal basis.

16.2. Examples. We now give some examples of Euclidean spaces and
orthogonal bases in them:


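Example 1. Let $R^n$ be real n-space, i.e., the set of all ordered n-tuples
$x = (x_1, x_2, \ldots, x_n)$, $y = (y_1, y_2, \ldots, y_n), \ldots$, equipped with the same
algebraic operations as in Example 2, p. 119. Using the formula

$$(x, y) = \sum_{k=1}^{n} x_k y_k \tag{4}$$

to define a scalar product in $R^n$, we get Euclidean n-space.¹¹ The corresponding
norm and distance in $R^n$ are

$$\|x\| = \sqrt{\sum_{k=1}^{n} x_k^2}$$

and

$$\rho(x, y) = \|x - y\| = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2}. \tag{5}$$

The vectors

$$e_1 = (1, 0, 0, \ldots, 0),$$
$$e_2 = (0, 1, 0, \ldots, 0),$$
$$\cdots$$
$$e_n = (0, 0, 0, \ldots, 1)$$

form an orthonormal basis in $R^n$, one of infinitely many such bases.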


Example 2. The space $l_2$ with elements $x = (x_1, x_2, \ldots, x_k, \ldots)$, $y =
(y_1, y_2, \ldots, y_k, \ldots), \ldots$, where

$$\sum_{k=1}^{\infty} x_k^2 < \infty, \qquad \sum_{k=1}^{\infty} y_k^2 < \infty, \ \ldots,$$

becomes an infinite-dimensional Euclidean space when equipped with the
scalar product

$$(x, y) = \sum_{k=1}^{\infty} x_k y_k. \tag{6}$$

The convergence of the right-hand side of (6) follows from the elementary
inequality

$$|x_k y_k| \le (|x_k| + |y_k|)^2 \le 2 (x_k^2 + y_k^2),$$

and it is an easy matter to verify that (6) has all the properties of a scalar
product. The simplest orthonormal basis in $l_2$ consists of the vectors

$$e_1 = (1, 0, 0, \ldots),$$
$$e_2 = (0, 1, 0, \ldots),$$
$$e_3 = (0, 0, 1, \ldots), \tag{7}$$
$$\cdots$$

The orthonormality of the system (7) is obvious. As for the completeness
of the system, given any vector $x = (x_1, x_2, \ldots, x_k, \ldots)$ in $l_2$, let

$$x^{(k)} = (x_1, x_2, \ldots, x_k, 0, 0, \ldots).$$

Then $x^{(k)}$ is a linear combination of the vectors $e_1, e_2, \ldots, e_k$ and
$\|x - x^{(k)}\| \to 0$ as $k \to \infty$.

11 The term "Euclidean n-space" has already been used in Example 3, p. 38 to describe
the metric space with distance (5). In so doing, we anticipated the eventual introduction of
the scalar product (4).


Example 3. The space $C^2_{[a,b]}$ consisting of all continuous functions on
$[a, b]$ equipped with the scalar product

$$(f, g) = \int_a^b f(t)\, g(t)\, dt$$

is another example of a Euclidean space. Among the various orthogonal
bases in $C^2_{[a,b]}$, one of the most important is the system of trigonometric
functions

$$1, \quad \cos \frac{2\pi n t}{b - a}, \quad \sin \frac{2\pi n t}{b - a} \qquad (n = 1, 2, \ldots). \tag{8}$$

The orthogonality of this system can be verified by a simple calculation.
Making the choice $a = -\pi$, $b = \pi$, we simplify (8) to

$$1, \quad \cos nt, \quad \sin nt \qquad (n = 1, 2, \ldots). \tag{8'}$$

Thus (8') is an orthogonal basis in the space $C^2_{[-\pi,\pi]}$. As for the completeness,
we have

THEOREM 3. The system (8) is complete in $C^2_{[a,b]}$.


Proof. By another version of Weierstrass' approximation theorem,¹²
every function $\varphi$ continuous on the interval $[a, b]$ and such that $\varphi(a) =
\varphi(b)$ is the limit of a uniformly convergent sequence of trigonometric
polynomials, i.e., linear combinations of elements of the system (8).
This sequence converges (a fortiori) to $\varphi$ in the norm of the space $C^2_{[a,b]}$.
But an arbitrary function $f \in C^2_{[a,b]}$ can be represented as the limit in the
$C^2_{[a,b]}$ norm of a sequence of functions $\{\varphi_n\}$, where

$$\varphi_n(x) = \begin{cases} f(x) & \text{if } a \le x \le b - \dfrac{1}{n}, \\[6pt] n \left[ f\!\left(b - \dfrac{1}{n}\right) - f(a) \right] (b - x) + f(a) & \text{if } b - \dfrac{1}{n} < x \le b \end{cases}$$

coincides with $f$ in the interval $[a, b - (1/n)]$, is linear on $[b - (1/n), b]$
and takes the same value at the point $b$ as at the point $a$ (see Figure 16).
Hence every element of $C^2_{[a,b]}$ can be approximated arbitrarily closely
(in the $C^2_{[a,b]}$ norm) by a linear combination of elements of the system (8). ∎

[FIGURE 16]

12 See e.g., G. P. Tolstov, op. cit., Corollary 1, p. 117.
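
The orthogonality of the trigonometric system on $[-\pi, \pi]$, which Example 3 leaves as a simple calculation, can also be checked numerically. The following is a minimal sketch, assuming a midpoint-rule approximation of the scalar product of Example 3; the helper names, the number of steps and the cutoff `n_max` are illustrative assumptions.

```python
import math


def inner(f, g, a=-math.pi, b=math.pi, steps=2000):
    """Midpoint-rule approximation of the integral of f(t) g(t) over [a, b]."""
    h = (b - a) / steps
    return h * sum(
        f(a + (j + 0.5) * h) * g(a + (j + 0.5) * h) for j in range(steps)
    )


def trig_system(n_max):
    """The functions 1, cos nt, sin nt for n = 1, ..., n_max."""
    funcs = [lambda t: 1.0]
    for n in range(1, n_max + 1):
        # Bind n as a default argument so each lambda keeps its own n.
        funcs.append(lambda t, n=n: math.cos(n * t))
        funcs.append(lambda t, n=n: math.sin(n * t))
    return funcs


system = trig_system(3)
# Matrix of all pairwise scalar products of the first seven functions.
gram = [[inner(f, g) for g in system] for f in system]
```

The off-diagonal entries of `gram` vanish up to rounding, while the diagonal entries are $2\pi$ for the constant function and $\pi$ for each sine and cosine. This also shows that the system is orthogonal but not orthonormal.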


16.3. Existence of an orthogonal basis. Orthogonalization. From now on,
we will be mainly concerned with the case of separable Euclidean spaces,
i.e., Euclidean spaces containing a countable everywhere dense subset. For
example, the spaces $R^n$, $l_2$ and $C^2_{[a,b]}$ are all separable, as shown in Sec. 6.3.
An example of a nonseparable Euclidean space is given in Problem 2.

THEOREM 4. Every orthogonal system $\{x_\alpha\}$ in a separable Euclidean
space $R$ has no more than countably many elements $x_\alpha$.


Proof. There is no loss of generality in assuming that the system
$\{x_\alpha\}$ is orthonormal as well as orthogonal, since otherwise we need only
replace $\{x_\alpha\}$ by

$$\left\{ \frac{x_\alpha}{\|x_\alpha\|} \right\}.$$

We then have

$$\|x_\alpha - x_\beta\| = \sqrt{2} \quad \text{if } \alpha \ne \beta. \tag{9}$$

Consider the set of open spheres $S(x_\alpha, \tfrac{1}{2})$. These spheres are pairwise
disjoint, because of (9). Moreover, each sphere contains at least one
element from some countable subset $\{y_n\}$ everywhere dense in $R$. Consequently
there are no more than countably many such spheres, and hence
no more than countably many elements $x_\alpha$. ∎


We have already exhibited an orthogonal basis in each of the spaces $R^n$,
$l_2$ and $C^2_{[a,b]}$. The existence of an orthogonal basis in any separable Euclidean
space is guaranteed by the following theorem and its corollary, analogous
to the theorem on the existence of an orthogonal basis in any finite-dimensional
Euclidean space:¹³


THEOREM 5 (Orthogonalization theorem). Let

$$f_1, f_2, \ldots, f_n, \ldots \tag{10}$$

be any (countable) set of linearly independent elements of a Euclidean
space $R$. Then $R$ contains a set of elements

$$\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots \tag{11}$$

such that

1) The system (11) is orthonormal;
2) Every element $\varphi_n$ is a linear combination

$$\varphi_n = a_{n1} f_1 + a_{n2} f_2 + \cdots + a_{nn} f_n \qquad (a_{nn} \ne 0)$$

of the elements $f_1, f_2, \ldots, f_n$;
3) Every element $f_n$ is a linear combination

$$f_n = b_{n1} \varphi_1 + b_{n2} \varphi_2 + \cdots + b_{nn} \varphi_n \qquad (b_{nn} \ne 0)$$

of the elements $\varphi_1, \varphi_2, \ldots, \varphi_n$.

Moreover, every element of the system (11) is uniquely determined by these
conditions to within a factor of $\pm 1$.


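Proof. First we construct $\varphi_1$. Setting

$$\varphi_1 = a_{11} f_1,$$

we determine $a_{11}$ from the condition

$$(\varphi_1, \varphi_1) = a_{11}^2 (f_1, f_1) = 1,$$

which implies

$$a_{11} = \frac{1}{b_{11}} = \frac{\pm 1}{\sqrt{(f_1, f_1)}}.$$

This obviously determines $\varphi_1$ uniquely (except for sign).

Next suppose elements $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ satisfying the conditions of
the theorem have already been constructed. Then $f_n$ can be written in the
form

$$f_n = b_{n1} \varphi_1 + \cdots + b_{n,n-1} \varphi_{n-1} + h_n, \tag{12}$$

where

$$(h_n, \varphi_k) = 0 \qquad (k = 1, 2, \ldots, n - 1).$$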

13 See e.g., G. E. Shilov, An Introduction to the Theory of Linear Spaces (translated by 
R. A. Silverman), Prentice-Hall, Inc., Englewood Cliffs, N.J. (1961), Theorem 28, p. 142. 
In fact, the coefficients $b_{nk}$, and hence the element $h_n$, are uniquely
determined by the conditions

$$(h_n, \varphi_k) = (f_n - b_{n1} \varphi_1 - \cdots - b_{n,n-1} \varphi_{n-1}, \varphi_k)
= (f_n, \varphi_k) - b_{nk} (\varphi_k, \varphi_k) = 0,$$

i.e.,

$$b_{nk} = (f_n, \varphi_k) \qquad (k = 1, 2, \ldots, n - 1).$$

Clearly $(h_n, h_n) > 0$, since $(h_n, h_n) = 0$ contradicts the assumed linear
independence of the elements (10). Let

$$\varphi_n = \frac{h_n}{\sqrt{(h_n, h_n)}}. \tag{13}$$

Using (12) and (13), we express $h_n$ and hence $\varphi_n$ in terms of the functions
$f_1, f_2, \ldots, f_n$, i.e.,

$$\varphi_n = a_{n1} f_1 + a_{n2} f_2 + \cdots + a_{nn} f_n,$$

where

$$a_{nn} = \frac{1}{\sqrt{(h_n, h_n)}} \ne 0.$$

Moreover

$$(\varphi_n, \varphi_k) = 0 \qquad (k = 1, 2, \ldots, n - 1),$$
$$(\varphi_n, \varphi_n) = 1$$

and

$$f_n = b_{n1} \varphi_1 + b_{n2} \varphi_2 + \cdots + b_{nn} \varphi_n,$$

where

$$b_{nn} = \sqrt{(h_n, h_n)} > 0.$$

Thus, starting from elements $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ satisfying the conditions
of the theorem, we have constructed elements $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}, \varphi_n$
satisfying the same conditions. The proof now follows by mathematical
induction. ∎
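
The inductive construction in this proof is an algorithm, and it can be sketched directly for vectors in Euclidean n-space. The following is a minimal sketch under that assumption; the function names, the tolerance used to detect linear dependence, and the sample vectors are illustrative choices, not part of the text. The sign of each output vector is fixed by taking the positive square root, as in (13).

```python
import math


def dot(x, y):
    """Scalar product in Euclidean n-space."""
    return sum(a * b for a, b in zip(x, y))


def orthonormalize(fs):
    """Turn linearly independent vectors f_1, f_2, ... into an orthonormal system."""
    phis = []
    for f in fs:
        # h_n = f_n - sum_k b_nk phi_k with b_nk = (f_n, phi_k), as in (12).
        h = list(f)
        for phi in phis:
            b = dot(f, phi)
            h = [hk - b * pk for hk, pk in zip(h, phi)]
        hh = dot(h, h)
        if hh < 1e-24:
            # (h_n, h_n) = 0 would contradict linear independence.
            raise ValueError("input vectors are linearly dependent")
        # phi_n = h_n / sqrt((h_n, h_n)), as in (13).
        phis.append([hk / math.sqrt(hh) for hk in h])
    return phis


fs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
phis = orthonormalize(fs)
```

Each `phis[n]` is a linear combination of `fs[0], ..., fs[n]` only, which mirrors condition 2) of the theorem. For ill-conditioned inputs a numerically more stable variant subtracts the projections from the running remainder instead of from `f`; that variant is not shown here.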


Remark. The process leading from the linearly independent elements (10) 
to the orthonormal system (11) is called orthogonalization. It is clear that 
the subspace generated by (10) coincides with that generated by (11). 
Hence the set (10) is complete if and only if the set (11) is complete. 


COROLLARY. Every separable Euclidean space R has a countable 
orthonormal basis. 


Proof. Let $\psi_1, \psi_2, \ldots, \psi_n, \ldots$ be a countable everywhere dense
subset of $R$. Then a complete set of linearly independent elements $f_1,
f_2, \ldots, f_n, \ldots$ can be selected from $\{\psi_n\}$. In fact, we need only eliminate
from the sequence $\{\psi_n\}$ all elements $\psi_k$ which can be written as linear
combinations of elements $\psi_i$ with smaller indices ($i < k$). Applying the
orthogonalization process to $f_1, f_2, \ldots, f_n, \ldots$, we get an orthonormal
basis. ∎


16.4. Bessel's inequality. Closed orthogonal systems. Let $e_1, e_2, \ldots, e_n$
be an orthonormal basis in $R^n$. Then every vector $x \in R^n$ can be written in
the form

$$x = \sum_{k=1}^{n} c_k e_k,$$

where

$$c_k = (x, e_k).$$


We now show how this generalizes to the case of an infinite-dimensional
Euclidean space $R$. Let $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$ be an orthonormal system in
$R$, and let $f$ be an arbitrary element of $R$. Suppose that with $f$ we associate

1) The sequence of numbers

$$c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots), \tag{14}$$

called the components or Fourier coefficients of $f$ with respect to the
system $\{\varphi_k\}$;
2) The series

$$\sum_{k=1}^{\infty} c_k \varphi_k \tag{15}$$

(for the time being, purely formal), called the Fourier series of $f$ with
respect to the system $\{\varphi_k\}$.

Then it is natural to ask whether the series (15) converges,¹⁴ and if so,
whether the sum of the series coincides with the original function $f$. To
answer these questions, we first prove


THEOREM 6. Given an orthonormal system

$$\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots \tag{16}$$

in a Euclidean space $R$, let $f$ be an arbitrary element of $R$. Then the
expression

$$\left\| f - \sum_{k=1}^{n} a_k \varphi_k \right\|^2$$

achieves its minimum for

$$a_k = c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots, n).$$


14 More exactly, whether the sequence of partial sums corresponding to (15) converges 
in the metric of R. 




This minimum equals

$$\|f\|^2 - \sum_{k=1}^{n} c_k^2.$$

Moreover

$$\sum_{k=1}^{\infty} c_k^2 \le \|f\|^2, \tag{17}$$

a result known as Bessel's inequality.


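Proof. Let

$$S_n = \sum_{k=1}^{n} a_k \varphi_k. \tag{18}$$

Then, by the orthonormality of (16),

$$\|f - S_n\|^2 = \Bigl( f - \sum_{k=1}^{n} a_k \varphi_k,\ f - \sum_{k=1}^{n} a_k \varphi_k \Bigr)$$
$$= (f, f) - 2 \Bigl( f, \sum_{k=1}^{n} a_k \varphi_k \Bigr) + \Bigl( \sum_{k=1}^{n} a_k \varphi_k, \sum_{l=1}^{n} a_l \varphi_l \Bigr)$$
$$= \|f\|^2 - 2 \sum_{k=1}^{n} a_k c_k + \sum_{k=1}^{n} a_k^2,$$

or

$$\|f - S_n\|^2 = \|f\|^2 - \sum_{k=1}^{n} c_k^2 + \sum_{k=1}^{n} (a_k - c_k)^2, \tag{19}$$

where

$$c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots, n).$$

The expression in the right-hand side of (19) obviously achieves its minimum
when its last term vanishes, i.e., when

$$a_k = c_k \qquad (k = 1, 2, \ldots, n),$$

and this minimum is just

$$\|f - S_n\|^2 = \|f\|^2 - \sum_{k=1}^{n} c_k^2. \tag{20}$$

Moreover, since $\|f - S_n\|^2 \ge 0$, it follows from (20) that

$$\sum_{k=1}^{n} c_k^2 \le \|f\|^2 \tag{21}$$

for every $n$. Hence the series

$$\sum_{k=1}^{\infty} c_k^2$$

is convergent. Taking the limit as $n \to \infty$ in (21), we get (17). ∎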
Remark. Geometrically, Bessel's inequality (17) means that the sum of
the squares of the projections of a vector $f$ onto a set of mutually perpendicular
directions cannot exceed the square of the length of the vector itself. For a
geometric interpretation of the rest of Theorem 6, see Problems 5 and 6.
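
Identity (19) and Bessel's inequality can be checked on a small example. The following is a minimal sketch in Euclidean three-space, assuming a hypothetical orthonormal system of two vectors and a sample element `f`; all names and numbers are illustrative and do not come from the text.

```python
import math


def dot(x, y):
    """Scalar product in Euclidean n-space."""
    return sum(a * b for a, b in zip(x, y))


s = 1.0 / math.sqrt(2.0)
# A hypothetical orthonormal (but not complete) system in three-space.
phi = [[s, s, 0.0], [s, -s, 0.0]]
f = [3.0, 1.0, 2.0]

# Fourier coefficients c_k = (f, phi_k), as in (14).
c = [dot(f, p) for p in phi]


def sq_error(a):
    """Squared norm of f minus the combination sum_k a_k phi_k."""
    s_n = [sum(ak * p[i] for ak, p in zip(a, phi)) for i in range(len(f))]
    d = [fi - si for fi, si in zip(f, s_n)]
    return dot(d, d)


best = sq_error(c)                              # the minimum in (20)
other = sq_error([c[0] + 0.5, c[1] - 0.25])     # any other coefficients do worse
bessel_gap = dot(f, f) - sum(ck * ck for ck in c)
```

Here `best` agrees with `bessel_gap`, which is positive because the system is not complete, and `other` exceeds `best` by exactly the sum of the squared coefficient perturbations, in line with (19).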


The case where Bessel’s inequality becomes an equality is particularly 
important: 


DEFINITION 4. Suppose equality holds in (17) for every $f \in R$, i.e.,
suppose

$$\sum_{k=1}^{\infty} c_k^2 = \|f\|^2 \tag{22}$$

for every $f \in R$. Then the orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ is said
to be closed.

Remark. This is another meaning of the word "closed," not to be
confused with its meaning in Sec. 6.4. The context will always make it
clear which meaning is intended. Formula (22) is known as Parseval's
theorem.


THEOREM 7. An orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ in a Euclidean
space $R$ is closed if and only if every element $f \in R$ is the sum of its Fourier
series.

Proof. According to Definition 4, the system $\{\varphi_k\}$ is closed if and only if (22) holds
for every $f \in R$. Taking the limit as $n \to \infty$ in (20) and using (18) with $a_k = c_k$, we see
that (22) holds for every $f \in R$ if and only if

$$\lim_{n \to \infty} \left\| f - \sum_{k=1}^{n} c_k \varphi_k \right\| = 0,$$

or equivalently

$$f = \sum_{k=1}^{\infty} c_k \varphi_k$$

for every $f \in R$. ∎


The properties of being complete and being closed are intimately connected, 
as shown by 


THEOREM 8. An orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ in a Euclidean
space $R$ is complete if and only if it is closed.

Proof. Suppose $\{\varphi_k\}$ is closed. Then, by Theorem 7, every element
$f \in R$ is the limit of the partial sums of its Fourier series. In other words,
linear combinations of elements of $\{\varphi_k\}$ are everywhere dense in $R$,
i.e., $\{\varphi_k\}$ is complete.

Conversely, suppose $\{\varphi_k\}$ is complete. Then every element $f \in R$ can
be approximated arbitrarily closely by a linear combination

$$\sum_{k=1}^{n} a_k \varphi_k$$

of elements of $\{\varphi_k\}$. But the partial sum

$$\sum_{k=1}^{n} c_k \varphi_k$$

of the Fourier series of $f$ is at least as good an approximation. Hence $f$
is the sum of its own Fourier series. It follows from Theorem 7 that
$\{\varphi_k\}$ is closed. ∎


COROLLARY. Every separable Euclidean space $R$ contains a closed
orthonormal system $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$

Proof. An immediate consequence of Theorem 8 and the corollary
to Theorem 5. ∎

Example 1. The orthonormal system (7) is closed in $l_2$.


Remark. In introducing the concepts of Fourier coefficients and Fourier
series, we assumed that the system $\{\varphi_k\}$ is orthonormal. More generally,
suppose $\{\varphi_k\}$ is orthogonal but not orthonormal, and let

$$\psi_k = \frac{\varphi_k}{\|\varphi_k\|}.$$

Then the system $\{\psi_k\}$ is orthonormal. Given any $f \in R$, let

$$c_k = (f, \psi_k) = \frac{1}{\|\varphi_k\|} (f, \varphi_k),$$

and consider the series

$$\sum_{k=1}^{\infty} c_k \psi_k = \sum_{k=1}^{\infty} \frac{c_k}{\|\varphi_k\|} \varphi_k = \sum_{k=1}^{\infty} a_k \varphi_k,$$

where

$$a_k = \frac{c_k}{\|\varphi_k\|} = \frac{(f, \varphi_k)}{\|\varphi_k\|^2}. \tag{23}$$

Then the coefficients (23) are called the Fourier coefficients of the element
$f \in R$ with respect to the orthogonal (but not orthonormal) system $\{\varphi_k\}$.
Substituting $c_k = a_k \|\varphi_k\|$ into (17), we get the following version of Bessel's
inequality for arbitrary orthogonal systems:

$$\sum_{k=1}^{\infty} \|\varphi_k\|^2 a_k^2 \le \|f\|^2. \tag{17'}$$




If equality holds in (17’) for every fe R, the orthogonal system {¢,} is said 
to be closed, just as in Definition 4. 


Example 2. The orthogonal system (8) is closed in $C^2_{[a,b]}$.


16.5. Complete Euclidean spaces. The Riesz-Fischer theorem. Given a
Euclidean space $R$, let $\{\varphi_k\}$ be an orthonormal (but not necessarily complete)
system in $R$. It follows from Bessel's inequality that a necessary condition
for the numbers $c_1, c_2, \ldots, c_k, \ldots$ to be Fourier coefficients of an element
$f \in R$ is that the series

$$\sum_{k=1}^{\infty} c_k^2$$

converge. It turns out that this condition is also sufficient if $R$ is complete,
as shown by


THEOREM 9 (Riesz-Fischer). Given an orthonormal system $\{\varphi_k\}$ in a
complete Euclidean space $R$, let the numbers $c_1, c_2, \ldots, c_k, \ldots$ be such
that

$$\sum_{k=1}^{\infty} c_k^2 \tag{24}$$

converges. Then there exists an element $f \in R$ with $c_1, c_2, \ldots, c_k, \ldots$ as
its Fourier coefficients, i.e., such that

$$\sum_{k=1}^{\infty} c_k^2 = \|f\|^2,$$

where

$$c_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots).$$

Proof. Writing

$$f_n = \sum_{k=1}^{n} c_k \varphi_k,$$

we have

$$\|f_{n+p} - f_n\|^2 = \|c_{n+1} \varphi_{n+1} + \cdots + c_{n+p} \varphi_{n+p}\|^2 = \sum_{k=n+1}^{n+p} c_k^2.$$

Hence $\{f_n\}$ converges to some element $f \in R$, by the convergence of (24)
and the completeness of $R$. Moreover,

$$(f, \varphi_k) = (f_n, \varphi_k) + (f - f_n, \varphi_k), \tag{25}$$

where the first term on the right equals $c_k$ if $n \ge k$ and the second term
approaches zero as $n \to \infty$, since

$$|(f - f_n, \varphi_k)| \le \|f - f_n\|\, \|\varphi_k\|.$$

Taking the limit as $n \to \infty$ in (25), we get

$$(f, \varphi_k) = c_k,$$

since the left-hand side is independent of $n$. Moreover,

$$\|f - f_n\| \to 0$$

as $n \to \infty$, and hence

$$\Bigl( f - \sum_{k=1}^{n} c_k \varphi_k,\ f - \sum_{k=1}^{n} c_k \varphi_k \Bigr) = (f, f) - \sum_{k=1}^{n} c_k^2 \to 0$$

as $n \to \infty$, i.e.,

$$\lim_{n \to \infty} \sum_{k=1}^{n} c_k^2 = \sum_{k=1}^{\infty} c_k^2 = \|f\|^2. \ ∎$$

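
The key step of the proof is that the partial sums $f_n$ form a fundamental sequence, because $\|f_{n+p} - f_n\|^2$ equals a tail of the convergent series (24). The following is a minimal sketch of that step, assuming the orthonormal system (7) in $l_2$ and the hypothetical coefficients $c_k = 1/k$, with every sequence truncated to finitely many coordinates for the computation.

```python
def partial_sum(c, n):
    """Coordinates of f_n = c_1 e_1 + ... + c_n e_n in the basis (7), truncated."""
    return c[:n] + [0.0] * (len(c) - n)


def sq_dist(u, v):
    """Squared l_2 distance between two truncated sequences."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical square-summable coefficients c_k = 1/k, k = 1, ..., 2000.
c = [1.0 / k for k in range(1, 2001)]

n, p = 100, 900
lhs = sq_dist(partial_sum(c, n + p), partial_sum(c, n))
rhs = sum(ck * ck for ck in c[n:n + p])  # sum of c_k^2 for k = n+1, ..., n+p
```

Both quantities agree, and they are smaller than $1/n$ because $\sum_{k>n} 1/k^2 < 1/n$, so the distance between partial sums can be made arbitrarily small by taking $n$ large.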
THEOREM 10. Let $\{\varphi_k\}$ be an orthonormal system in a complete Euclidean
space $R$. Then $\{\varphi_k\}$ is complete if and only if $R$ contains no nonzero
element orthogonal to all the elements of $\{\varphi_k\}$.

Proof. Suppose $\{\varphi_k\}$ is complete and hence closed (by Theorem 8),
and suppose $f$ is orthogonal to all the elements of $\{\varphi_k\}$. Then all the
Fourier coefficients of $f$ vanish. Hence

$$\|f\|^2 = \sum_{k=1}^{\infty} c_k^2 = 0$$

by Parseval's theorem (22), i.e., $f = 0$.

Conversely, suppose $\{\varphi_k\}$ is not complete. Then $R$ contains an
element $g \ne 0$ such that

$$\|g\|^2 > \sum_{k=1}^{\infty} c_k^2 \quad \text{where } c_k = (g, \varphi_k)$$

(why?). By the Riesz-Fischer theorem, there exists an element $f \in R$
such that

$$(f, \varphi_k) = c_k, \qquad \|f\|^2 = \sum_{k=1}^{\infty} c_k^2.$$

But $f - g$ is orthogonal to all the $\varphi_k$, by construction. Moreover, it
follows from

$$\|f\|^2 = \sum_{k=1}^{\infty} c_k^2 < \|g\|^2$$

that $f - g \ne 0$. ∎

16.6. Hilbert space. The isomorphism theorem. Continuing our study of
complete Euclidean spaces, we concentrate our attention on infinite-dimensional
spaces, since finite-dimensional spaces are considered in great
detail in courses on linear algebra.




DEFINITION 5. By a Hilbert space¹⁵ is meant a Euclidean space which
is complete, separable and infinite-dimensional.


In other words, a Hilbert space is a set H of elements f, g, ... of any
kind such that


1) H is a Euclidean space, i.e., a real linear space¹⁶ equipped with a
scalar product;

2) H is complete with respect to the metric $\rho(f, g) = \|f - g\|$;

3) H is separable, i.e., H contains a countable everywhere dense subset;

4) H is infinite-dimensional, i.e., given any positive integer n, H contains
n linearly independent elements.


Example. The real space $l_2$ is a Hilbert space (check all the properties).


DEFINITION 6. Two Euclidean spaces R and R* are said to be isomorphic
(to each other) if there is a one-to-one correspondence $x \leftrightarrow x^*$, $y \leftrightarrow y^*$
between the elements of R and those of R* ($x, y \in R$; $x^*, y^* \in R^*$)
preserving linear operations and scalar products in the sense that¹⁷

$$x + y \leftrightarrow x^* + y^*, \qquad \alpha x \leftrightarrow \alpha x^*, \qquad (x, y) = (x^*, y^*).$$


It is well known that any two n-dimensional Euclidean spaces are isomorphic
to each other, and in particular that every n-dimensional Euclidean
space is isomorphic to the space $R^n$ of Example 1, p. 144.¹⁸ On the other
hand, two infinite-dimensional Euclidean spaces need not be isomorphic.
For example, the spaces $l_2$ and $C^2_{[a,b]}$ are not isomorphic, as can be seen from
the fact that $l_2$ is complete while $C^2_{[a,b]}$ is not (recall Examples 4 and 5,
p. 57). Nevertheless, for Hilbert spaces we have


THEOREM 11 (Isomorphism theorem). Any two Hilbert spaces are 
isomorphic. 


Proof. The theorem will be proved once we manage to show that
every Hilbert space H is isomorphic to $l_2$. Let $\{\varphi_k\}$ be any complete
orthonormal system in H (such exists by the corollary to Theorem 5),
and with every element $f \in H$ associate its Fourier coefficients $\{c_k\}$ with
respect to $\{\varphi_k\}$. Since

$$\sum_{k=1}^{\infty} c_k^2 < \infty,$$

by Theorem 8, the sequence $(c_1, c_2, \ldots, c_k, \ldots)$ belongs to $l_2$. Conversely,
by the Riesz-Fischer theorem, to every element $(c_1, c_2, \ldots, c_k, \ldots)$
in $l_2$ there corresponds an element $f \in H$ with the numbers
$c_1, c_2, \ldots, c_k, \ldots$ as its Fourier coefficients. This correspondence between
the elements of H and those of $l_2$ is obviously one-to-one. Moreover, if

$$f \leftrightarrow (c_1, c_2, \ldots, c_k, \ldots),$$

$$\tilde f \leftrightarrow (\tilde c_1, \tilde c_2, \ldots, \tilde c_k, \ldots),$$

then clearly

$$f + \tilde f \leftrightarrow (c_1 + \tilde c_1, c_2 + \tilde c_2, \ldots, c_k + \tilde c_k, \ldots),$$

$$\alpha f \leftrightarrow (\alpha c_1, \alpha c_2, \ldots, \alpha c_k, \ldots),$$

i.e., sums go into sums and scalar multiples into scalar multiples with the
same factor. Finally, by Parseval's theorem,

$$(f, f) = \sum_{k=1}^{\infty} c_k^2, \qquad (\tilde f, \tilde f) = \sum_{k=1}^{\infty} \tilde c_k^2,$$

$$(f, f) + 2(f, \tilde f) + (\tilde f, \tilde f) = (f + \tilde f, f + \tilde f) = \sum_{k=1}^{\infty} (c_k + \tilde c_k)^2$$

$$= \sum_{k=1}^{\infty} c_k^2 + 2\sum_{k=1}^{\infty} c_k \tilde c_k + \sum_{k=1}^{\infty} \tilde c_k^2,$$

and hence

$$(f, \tilde f) = \sum_{k=1}^{\infty} c_k \tilde c_k,$$

so that scalar products are preserved. $\blacksquare$


15 Named after the celebrated German mathematician David Hilbert (1862-1943).

16 However, see Sec. 16.9.

17 Isomorphism of two normed linear spaces R and R* is defined in the same way,
except that preservation of scalar products is replaced by preservation of norms, i.e., by
the condition $\|x\| = \|x^*\|$.

18 See e.g., G. E. Shilov, op. cit., Theorem 29, p. 144.


Remark. Theorem 11 shows that to within an isomorphism, there is
only one Hilbert space (i.e., only one space with the four properties listed
above), and that this space has $l_2$ as its “coordinate realization,” just as
the space of all ordered n-tuples of real numbers with the scalar product
$\sum_{k=1}^{n} x_k y_k$ is the “coordinate realization” of axiomatically defined Euclidean
n-space.


16.7. Subspaces. Orthogonal complements and direct sums. In keeping
with the terminology of Sec. 15.2, by a linear manifold in a Hilbert space H
we mean a set L of elements of H such that $f, g \in L$ implies $\alpha f + \beta g \in L$ for
arbitrary numbers $\alpha$ and $\beta$, while by a subspace of H we mean a closed linear
manifold in H.


Lemma. If a metric space R has a countable everywhere dense subset,
then so does every subset $R' \subset R$.

Proof. Let

$$\xi_1, \xi_2, \ldots, \xi_n, \ldots$$

be a countable everywhere dense subset of R, and let

$$a_n = \inf_{\eta \in R'} \rho(\xi_n, \eta).$$

Then, given any positive integers n and p, there is a point $\eta_{np} \in R'$ such
that

$$\rho(\xi_n, \eta_{np}) < a_n + \frac{1}{p}.$$

Given any $\varepsilon > 0$ and any $\eta \in R'$, let

$$\frac{1}{p} < \frac{\varepsilon}{3},$$

and choose n such that

$$\rho(\xi_n, \eta) < \frac{\varepsilon}{3}.$$

Then

$$\rho(\xi_n, \eta_{np}) < a_n + \frac{1}{p} < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \frac{2\varepsilon}{3},$$

and hence $\rho(\eta, \eta_{np}) < \varepsilon$. In other words, R' has an everywhere dense
subset $\{\eta_{np}\}$ (n, p = 1, 2, ...) containing no more than countably many
elements. $\blacksquare$


THEOREM 12. Every subspace M of a Hilbert space H is either a (complete
separable) Euclidean space or itself a Hilbert space. Moreover, M
has an orthonormal basis, like H itself.


Proof. The fact that M has properties 1) and 2) of Definition 5 is
obvious. The separability of M follows from the lemma. To construct an
orthonormal basis in M, apply Theorem 5 to any countable everywhere
dense subset of M. $\blacksquare$


Subspaces of a Hilbert space H have certain special properties (not shared 
by subspaces of an arbitrary normed linear space), stemming from the 
presence of a scalar product in H and the associated concept of orthogonality: 


THEOREM 13. Let M be a subspace of a Hilbert space H, and let

$$M' = H \ominus M$$

denote the orthogonal complement of M, i.e., the set of all elements $h' \in H$
orthogonal to every $h \in M$. Then M' is also a subspace of H.

Proof. The linearity of M' is obvious, since

$$(h_1', h) = (h_2', h) = 0$$

implies

$$(\alpha_1 h_1' + \alpha_2 h_2', h) = 0$$

for arbitrary numbers $\alpha_1$ and $\alpha_2$. To show that M' is closed, suppose
$\{h_n'\}$ is a sequence of elements of M' converging to h'. Then, given any
$h \in M$,

$$(h', h) = \lim_{n \to \infty} (h_n', h) = 0,$$

and hence $h' \in M'$. $\blacksquare$

Remark. By definition, $h' \in M'$ if and only if h' is orthogonal to every
$h \in M$. But then $h \in M$ if and only if h is orthogonal to every $h' \in M'$. Hence
$M' = H \ominus M$ implies $M = H \ominus M'$, and we can call M and M' (mutually)
orthogonal subspaces of H.


THEOREM 14. Let M be a subspace of a Hilbert space H, and let
$M' = H \ominus M$ be the orthogonal complement of M. Then every element
$f \in H$ has a unique representation of the form

$$f = h + h', \tag{26}$$

where $h \in M$, $h' \in M'$.


Proof. Given any $f \in H$, let $\{\varphi_k\}$ be an orthonormal basis in M, and
let

$$h = \sum_{k=1}^{\infty} c_k \varphi_k, \qquad c_k = (f, \varphi_k).$$

By Bessel's inequality,

$$\sum_{k=1}^{\infty} c_k^2 < \infty,$$

and hence, by the Riesz-Fischer theorem, h exists and belongs to M.
Let

$$h' = f - h.$$

Then obviously

$$(h', \varphi_k) = 0$$

for all k, and since any element $g \in M$ can be represented in the form

$$g = \sum_{k=1}^{\infty} a_k \varphi_k,$$

we have

$$(h', g) = \sum_{k=1}^{\infty} a_k (h', \varphi_k) = 0,$$

i.e., $h' \in M'$. This proves the existence of the representation (26).
To prove the uniqueness of (26), suppose there is another representation

$$f = h_1 + h_1',$$

where $h_1 \in M$, $h_1' \in M'$. Then

$$(h_1, \varphi_k) = (f, \varphi_k) = c_k$$

for all k, and hence

$$h_1 = h, \qquad h_1' = h'. \qquad \blacksquare$$
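
The construction used in the proof of Theorem 14 can be imitated in a finite-dimensional setting. The sketch below is an illustration under stated assumptions, not part of the proof: the space is taken to be three-dimensional coordinate space with the usual scalar product, M is a plane with a hand-picked orthonormal basis, and the helper names `dot` and `decompose` are hypothetical.

```python
# Illustration only: a finite-dimensional stand-in for H, with M spanned
# by a hand-picked orthonormal pair. All names here are hypothetical.

def dot(x, y):
    """Scalar product of two real coordinate vectors."""
    return sum(a * b for a, b in zip(x, y))

def decompose(f, basis):
    """Return (h, h') with h in the span of `basis` and h' = f - h.

    `basis` must be an orthonormal list of vectors; h is formed from the
    Fourier coefficients c_k = (f, phi_k), as in the proof of Theorem 14.
    """
    coeffs = [dot(f, phi) for phi in basis]
    h = [sum(c * phi[i] for c, phi in zip(coeffs, basis)) for i in range(len(f))]
    h_prime = [a - b for a, b in zip(f, h)]
    return h, h_prime

s = 2 ** -0.5
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]   # orthonormal basis of the plane M
f = [3.0, 1.0, 2.0]

h, h_prime = decompose(f, basis)
# h is approximately (2, 2, 2) and h' approximately (1, -1, 0);
# h' is orthogonal to every basis vector of M.
```

The component h' is orthogonal to each basis vector of M, hence to all of M, which is the content of the representation (26) in this small example.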


COROLLARY 1. Every orthonormal system $\{\varphi_n\}$ in a Hilbert space H
can be enlarged to give a complete orthonormal system in H.


Proof. Let M be the linear closure of $\{\varphi_n\}$, so that $\{\varphi_n\}$ is complete
in M. Let $M' = H \ominus M$ be the orthogonal complement of M, and let
$\{\varphi_n'\}$ be a complete orthonormal system in M' (such exists by Theorem 12,
since M' is a subspace). Recalling (26), we see that the union of $\{\varphi_n\}$
and $\{\varphi_n'\}$ is a complete orthonormal system in H. $\blacksquare$


COROLLARY 2. Let M be a subspace of a Hilbert space H, and let
$M' = H \ominus M$. Then M' has codimension n if M has dimension n and
dimension n if M has codimension n.


Proof. An immediate consequence of the representation (26) and
Theorem 2, p. 122. $\blacksquare$


Let M be a subspace of a Hilbert space H, with orthogonal complement
$M' = H \ominus M$. If every vector $f \in H$ can be represented in the form

$$f = h + h' \qquad (h \in M,\ h' \in M'),$$

we say that H is the direct sum of the orthogonal subspaces M and M', and
write

$$H = M \oplus M'.$$

The concept of a direct sum generalizes at once to the case of any finite or
even countable number of subspaces: Thus H is said to be the direct sum
of the subspaces $M_1, M_2, \ldots, M_n, \ldots$ and we write

$$H = M_1 \oplus M_2 \oplus \cdots \oplus M_n \oplus \cdots$$

if

1) The subspaces $M_j$ are pairwise orthogonal, i.e., every element in $M_j$
is orthogonal to every element in $M_k$ whenever $j \neq k$;
2) Every element $f \in H$ has a representation of the form

$$f = h_1 + h_2 + \cdots + h_n + \cdots, \tag{27}$$

where $h_n \in M_n$ (n = 1, 2, ...).

It is easy to see that the representation (27) is unique if it exists and that

$$\|f\|^2 = \sum_{n=1}^{\infty} \|h_n\|^2$$

(give the details).

Besides direct sums of subspaces, we can also talk about direct sums of a
finite or countable number of Hilbert spaces. Thus, given two Hilbert spaces
$H_1$ and $H_2$, by the direct sum

$$H = H_1 \oplus H_2$$

is meant the set of all ordered pairs $(h_1, h_2)$ with $h_1 \in H_1$, $h_2 \in H_2$, where
linear operations and the scalar product in H are defined by

$$(h_1, h_2) + (h_1', h_2') = (h_1 + h_1', h_2 + h_2'),$$
$$\alpha(h_1, h_2) = (\alpha h_1, \alpha h_2),$$
$$((h_1, h_2), (h_1', h_2')) = (h_1, h_1') + (h_2, h_2').$$

Consider the subspace of H consisting of all pairs of the form $(h_1, 0)$ and
the subspace consisting of all pairs of the form $(0, h_2)$. Then clearly these
two subspaces are orthogonal and can be identified in a natural way with $H_1$
and $H_2$, respectively. More generally, given any Hilbert spaces
$H_1, H_2, \ldots, H_n, \ldots$, by the direct sum

$$H = H_1 \oplus H_2 \oplus \cdots \oplus H_n \oplus \cdots$$

is meant the set of all sequences

$$h = (h_1, h_2, \ldots, h_n, \ldots) \qquad (h_n \in H_n)$$

such that

$$\sum_{n=1}^{\infty} \|h_n\|^2 < \infty,$$

with linear operations defined in the obvious way and the scalar product of
two elements $h = (h_1, h_2, \ldots, h_n, \ldots)$, $g = (g_1, g_2, \ldots, g_n, \ldots)$ defined by

$$(h, g) = \sum_{n=1}^{\infty} (h_n, g_n).$$


16.8. Characterization of Euclidean spaces. Given a normed linear space 
R, we now look for circumstances under which R is Euclidean. In other 
words, we look for extra conditions on the norm of R which guarantee that 
the norm be derivable from some suitably defined scalar product in R. 


THEOREM 15. A necessary and sufficient condition for a normed linear
space R to be Euclidean is that

$$\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2) \tag{28}$$

for every $f, g \in R$.

Proof. Thinking of $f + g$ and $f - g$ as the “diagonals of the parallelogram
in R with sides f and g,” we can interpret (28) as the analogue of a
familiar property of parallelograms in the plane, i.e., the sum of the
squares of the diagonals of a parallelogram equals the sum of the
squares of its sides. The necessity of (28) is obvious, since if R is
Euclidean, then

$$\|f + g\|^2 + \|f - g\|^2 = (f + g, f + g) + (f - g, f - g)$$
$$= (f, f) + 2(f, g) + (g, g) + (f, f) - 2(f, g) + (g, g)$$
$$= 2(\|f\|^2 + \|g\|^2).$$
To prove the sufficiency of (28), we set

$$(f, g) = \tfrac{1}{4}\bigl(\|f + g\|^2 - \|f - g\|^2\bigr) \tag{29}$$

and show that if (28) holds, then (29) has all the properties of a scalar
product listed on p. 142. Since (29) implies

$$(f, f) = \tfrac{1}{4}\bigl(\|2f\|^2 - \|f - f\|^2\bigr) = \|f\|^2, \tag{30}$$

the scalar product (29) clearly generates the given norm $\|\cdot\|$ in R. Moreover,
it follows at once from (29) and (30) that

1) $(f, f) \geq 0$ where $(f, f) = 0$ if and only if $f = 0$;

2) $(f, g) = (g, f)$.

The proof of the linearity properties

$$(f + g, h) = (f, h) + (g, h) \tag{31}$$

and

$$(\alpha f, g) = \alpha (f, g) \tag{32}$$

requires a little work. To prove (31), consider the function of three
vectors

$$\Phi(f, g, h) = 4[(f + g, h) - (f, h) - (g, h)],$$

or equivalently

$$\Phi(f, g, h) = \|f + g + h\|^2 - \|f + g - h\|^2 - \|f + h\|^2 + \|f - h\|^2 - \|g + h\|^2 + \|g - h\|^2 \tag{33}$$

after using (29). It follows from (28) that

$$\|f + g \pm h\|^2 = 2\|f \pm h\|^2 + 2\|g\|^2 - \|f \pm h - g\|^2. \tag{34}$$

Substituting (34) into (33), we get

$$\Phi(f, g, h) = -\|f + h - g\|^2 + \|f - h - g\|^2 + \|f + h\|^2 - \|f - h\|^2 - \|g + h\|^2 + \|g - h\|^2. \tag{35}$$

Taking half the sum of (33) and (35), we find that

$$\Phi(f, g, h) = \tfrac{1}{2}\bigl(\|g + h + f\|^2 + \|g + h - f\|^2\bigr) - \tfrac{1}{2}\bigl(\|g - h + f\|^2 + \|g - h - f\|^2\bigr) - \|g + h\|^2 + \|g - h\|^2,$$

which becomes

$$\Phi(f, g, h) = \bigl(\|g + h\|^2 + \|f\|^2\bigr) - \bigl(\|g - h\|^2 + \|f\|^2\bigr) - \|g + h\|^2 + \|g - h\|^2 = 0$$

after applying (28) to both expressions in parentheses. But $\Phi(f, g, h) = 0$
is equivalent to (31).
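To prove (32), we introduce the function

$$\varphi(c) = (cf, g) - c(f, g),$$

where f and g are fixed but arbitrary elements of R. It follows at once
from (29) that

$$\varphi(0) = \tfrac{1}{4}\bigl(\|g\|^2 - \|g\|^2\bigr) = 0$$

and $\varphi(-1) = 0$, since $(-f, g) = -(f, g)$. Hence, given any integer n,

$$(nf, g) = (\operatorname{sgn} n\,(f + \cdots + f), g) = \operatorname{sgn} n\,[(f, g) + \cdots + (f, g)] = |n| \operatorname{sgn} n\,(f, g) = n(f, g),$$

i.e., $\varphi(n) = 0$. Moreover, given any integers p, q ($q \neq 0$),

$$\Bigl(\frac{p}{q} f, g\Bigr) = p\Bigl(\frac{1}{q} f, g\Bigr) = \frac{p}{q}\, q\Bigl(\frac{1}{q} f, g\Bigr) = \frac{p}{q}(f, g),$$

i.e., $\varphi(c) = 0$ for all rational c. But $\varphi(c)$ is a continuous function of c
(why?), and hence $\varphi(c) \equiv 0$, which is equivalent to (32). $\blacksquare$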
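
Formula (29) recovers the scalar product from the norm alone. As a numerical illustration (a sketch under the assumption that R is five-dimensional coordinate space with the usual Euclidean norm; the helper names are hypothetical), one can check that the expression (29) agrees with the ordinary scalar product and satisfies the additivity property (31) up to rounding error.

```python
# Illustration only: R is assumed to be R^5 with the Euclidean norm,
# for which the parallelogram condition (28) is known to hold.
import random

def dot(x, y):
    """Ordinary scalar product of two coordinate vectors."""
    return sum(a * b for a, b in zip(x, y))

def norm_sq(x):
    """Square of the Euclidean norm."""
    return dot(x, x)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def polar(f, g):
    """Scalar product rebuilt from the norm, as in (29)."""
    return 0.25 * (norm_sq(add(f, g)) - norm_sq(sub(f, g)))

random.seed(0)
f, g, h = ([random.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(3))

# Difference between (29) and the ordinary scalar product.
recovery_gap = polar(f, g) - dot(f, g)
# Defect in the additivity property (31).
additivity_gap = polar(add(f, g), h) - polar(f, h) - polar(g, h)
```

Both gaps vanish up to floating-point error here; for a norm violating (28), the same expression `polar` would in general fail to be additive.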


Example 1. The n-dimensional space $R_p^n$, equipped with the norm

$$\|x\|_p = \Bigl(\sum_{k=1}^{n} |x_k|^p\Bigr)^{1/p},$$

is a normed linear space if $p \geq 1$ (see Example 10, p. 41) and a Euclidean
space if p = 2 (see Example 1, p. 144). However, $R_p^n$ fails to be Euclidean
if $p \neq 2$. In fact, for the two vectors

$$f = (1, 1, 0, \ldots, 0),$$
$$g = (1, -1, 0, \ldots, 0),$$

we have

$$f + g = (2, 0, 0, \ldots, 0),$$
$$f - g = (0, 2, 0, \ldots, 0),$$

and hence

$$\|f\|_p = \|g\|_p = 2^{1/p}, \qquad \|f + g\|_p = \|f - g\|_p = 2.$$

Therefore the “parallelogram condition” (28) fails if $p \neq 2$.
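
The computation in this example is easy to reproduce numerically. The following Python sketch (an illustration; the function names are hypothetical and the dimension n = 3 is an arbitrary choice) evaluates the difference between the two sides of (28) for the vectors f and g above and several values of p.

```python
# Illustration only: parallelogram defect of the p-norm for the vectors
# f = (1, 1, 0) and g = (1, -1, 0); the dimension 3 is an arbitrary choice.

def p_norm(x, p):
    """The norm (sum |x_k|^p)^(1/p)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_defect(f, g, p):
    """Left-hand side of (28) minus its right-hand side, in the p-norm."""
    s = [a + b for a, b in zip(f, g)]
    d = [a - b for a, b in zip(f, g)]
    lhs = p_norm(s, p) ** 2 + p_norm(d, p) ** 2
    rhs = 2.0 * (p_norm(f, p) ** 2 + p_norm(g, p) ** 2)
    return lhs - rhs

f = [1.0, 1.0, 0.0]
g = [1.0, -1.0, 0.0]

# In exact arithmetic the defect equals 8 - 4 * 2**(2/p), which is zero
# only for p = 2.
defects = {p: parallelogram_defect(f, g, p) for p in (1, 1.5, 2, 3)}
```

The defect is negative for p < 2, zero (up to rounding) for p = 2, and positive for p > 2.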


Example 2. Consider the space $C_{[0, \pi/2]}$ of all functions continuous on the
interval $[0, \pi/2]$, and let

$$f(t) = \cos t, \qquad g(t) = \sin t.$$

Then

$$\|f\| = \|g\| = 1,$$

and

$$\|f + g\| = \max_{0 \leq t \leq \pi/2} |\cos t + \sin t| = \sqrt{2},$$
$$\|f - g\| = \max_{0 \leq t \leq \pi/2} |\cos t - \sin t| = 1.$$

Therefore

$$\|f + g\|^2 + \|f - g\|^2 \neq 2(\|f\|^2 + \|g\|^2).$$

It follows that the norm in $C_{[0, \pi/2]}$ cannot be generated by any scalar product
whatsoever, i.e., the space $C_{[0, \pi/2]}$ fails to be Euclidean. It is easy to see that
the same is true of the space $C_{[a,b]}$ for any a and b (a < b).


16.9. Complex Euclidean spaces. Besides real Euclidean spaces, we can
also consider complex Euclidean spaces, i.e., complex linear spaces equipped
with a scalar product. However, we must now modify the properties of the
scalar product listed on p. 142, since in the complex case these properties
are contradictory as they stand. In fact, it follows from properties 2) and
3), p. 142 that

$$(\lambda x, \lambda x) = \lambda^2 (x, x),$$

and hence, after choosing $\lambda = i$, that

$$(ix, ix) = -(x, x),$$

i.e., the norms of the vectors x and ix cannot both be positive, contrary to
property 1). To remedy this difficulty, we define the scalar product in a
complex Euclidean space R as a complex-valued function (x, y), defined for
every pair of elements $x, y \in R$, with the following properties:


1') $(x, x) \geq 0$ where $(x, x) = 0$ if and only if $x = 0$;

2') $(x, y) = \overline{(y, x)}$;

3') $(\lambda x, y) = \lambda (x, y)$;

4') $(x, y + z) = (x, y) + (x, z)$

(valid for all $x, y, z \in R$ and all complex $\lambda$). It follows from 2') and 3') that

$$(x, \lambda y) = \overline{(\lambda y, x)} = \bar\lambda\, \overline{(y, x)} = \bar\lambda (x, y)$$

(as usual, the overbar denotes the complex conjugate).
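
Properties 1') to 3') and the conjugate-linearity in the second argument can be checked on concrete vectors. The sketch below is an illustration only; it assumes the scalar product $\sum x_k \bar y_k$ on pairs of complex numbers, and the name `inner` is hypothetical.

```python
# Illustration only: the scalar product sum(x_k * conj(y_k)) on C^2,
# with hand-picked vectors and a hand-picked complex factor.

def inner(x, y):
    """Scalar product of two complex coordinate vectors."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
lam = 2 + 3j

sq_norm_x = inner(x, x)                        # real and positive: 15
hermitian_gap = inner(x, y) - inner(y, x).conjugate()
linear_gap = inner([lam * a for a in x], y) - lam * inner(x, y)
conj_linear_gap = inner(x, [lam * b for b in y]) - lam.conjugate() * inner(x, y)

# With the modified axioms, x and ix have the same norm.
ix = [1j * a for a in x]
sq_norm_ix = inner(ix, ix)
```

In particular the vectors x and ix now have the same norm, so the contradiction noted above for the unmodified axioms disappears.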




Example 1. The space $C^n$ introduced in Example 2, p. 119 becomes a
complex Euclidean space if we define the scalar product of two elements
$x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ in $C^n$ as

$$(x, y) = \sum_{k=1}^{n} x_k \bar y_k.$$


Example 2. The complex space $l_2$ with elements $x = (x_1, x_2, \ldots, x_k, \ldots)$,
$y = (y_1, y_2, \ldots, y_k, \ldots)$, ..., where

$$\sum_{k=1}^{\infty} |x_k|^2 < \infty, \qquad \sum_{k=1}^{\infty} |y_k|^2 < \infty, \ldots,$$

becomes an infinite-dimensional complex Euclidean space when equipped
with the scalar product

$$(x, y) = \sum_{k=1}^{\infty} x_k \bar y_k.$$


Example 3. The space $C^2_{[a,b]}$ of all complex-valued functions continuous
on the interval [a, b], equipped with the scalar product

$$(f, g) = \int_a^b f(t)\, \overline{g(t)}\, dt,$$

is another example of an infinite-dimensional complex Euclidean space.


The norm (length) of a vector in a complex Euclidean space is defined
by the same formula

$$\|x\| = \sqrt{(x, x)}$$

as in the real case. However, the notion of the angle between two vectors
x and y plays no role in the complex case, since the quantity

$$\frac{(x, y)}{\|x\| \, \|y\|}$$

is in general complex and hence cannot be the cosine of a real angle. On
the other hand, the notion of orthogonality is defined in the same way as
before, i.e., two elements x and y of a complex Euclidean space are said
to be orthogonal if (x, y) = 0.

Let $\{\varphi_k\}$ be any orthogonal system in a complex Euclidean space R, and
let f be any element of R. Then, just as in the real case, the numbers

$$a_k = \frac{1}{\|\varphi_k\|^2} (f, \varphi_k)$$

and the series

$$\sum_{k=1}^{\infty} a_k \varphi_k$$

are called the Fourier coefficients and the Fourier series of the function f,
with respect to the system $\{\varphi_k\}$. In the complex case, Bessel's inequality
(17') becomes

$$\sum_{k=1}^{\infty} |a_k|^2 \|\varphi_k\|^2 \leq \|f\|^2.$$

If the system $\{\varphi_k\}$ is orthonormal, the Fourier coefficients become

$$a_k = c_k = (f, \varphi_k)$$

and Bessel's inequality simplifies to

$$\sum_{k=1}^{\infty} |c_k|^2 \leq \|f\|^2.$$


By a complex Hilbert space is meant a complex Euclidean space which is 
complete, separable and infinite-dimensional. Theorem 11 carries over at 
once to the complex case, with isomorphism being defined exactly as in 
Definition 6: 


THEOREM 11’ (Isomorphism theorem). Any two complex Hilbert spaces 
are isomorphic. 


Proof. This time show that every complex Hilbert space is isomorphic
to the complex space $l_2$, the “coordinate realization” of a complex
Hilbert space. $\blacksquare$


Remark. As an exercise, the reader should state and prove the complex 
analogues of all the other theorems of Sec. 16. 


Problem 1. Prove that in a Euclidean space, the operations of addition,
multiplication by numbers and the formation of scalar products are all
continuous. More exactly, prove that if $x_n \to x$, $y_n \to y$ (in the sense of
norm convergence) and $\lambda_n \to \lambda$ (in the sense of ordinary convergence), then

$$x_n + y_n \to x + y, \qquad \lambda_n x_n \to \lambda x, \qquad (x_n, y_n) \to (x, y).$$

Hint. Use Schwarz's inequality.


Problem 2. Let R be the set of all functions f defined on the interval [0, 1]
such that


1) f(t) is nonzero at no more than countably many points $t_1, t_2, \ldots$;

2) $\sum_{i=1}^{\infty} f^2(t_i) < \infty$.


Define addition of elements and multiplication of elements by scalars in the
ordinary way, i.e., $(f + g)(t) = f(t) + g(t)$, $(\alpha f)(t) = \alpha f(t)$. If f and g are
two elements of R, nonzero only at the points $t_1, t_2, \ldots$ and $t_1', t_2', \ldots$,
respectively, define the scalar product of f and g as

$$(f, g) = \sum_{i, j \,:\, t_i = t_j'} f(t_i)\, g(t_j').$$

Prove that this scalar product makes R into a Euclidean space. Prove that R
is nonseparable, i.e., that R contains no countable everywhere dense subset.


Problem 3. Give an example of a (nonseparable) Euclidean space which 
has no orthonormal basis. Prove that a complete Euclidean space (not 
necessarily separable) always has an orthonormal basis. 


Problem 4. Prove that every nested sequence of nonempty closed bounded 
convex sets in a complete Euclidean space (not necessarily separable) has a 
nonempty intersection. 


Comment. Cf. Problem 6, p. 66 and Problem 2, p. 141. 


Problem 5. Given a Euclidean space R, let $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$ be an
orthonormal basis in R and f an arbitrary element of R. Prove that the
element

$$f - \sum_{k=1}^{n} a_k \varphi_k$$

is orthogonal to all linear combinations of the form

$$\sum_{k=1}^{n} b_k \varphi_k$$

if and only if

$$a_k = (f, \varphi_k) \qquad (k = 1, 2, \ldots, n).$$


Problem 6. According to elementary geometry, the length of the perpendicular
dropped from a point P to a line L or plane $\Pi$ is smaller than the
length of any other line segment joining P to L or $\Pi$. What is the natural
generalization of this fact to the case of an arbitrary Euclidean space?


Hint. Use Theorem 6 and the result of the preceding problem. 


Problem 7. Let R be a complete Euclidean space (not necessarily separable),
so that R has an orthonormal basis $\{\varphi_\alpha\}$, by Problem 3. Prove that
every vector $f \in R$ satisfies the formulas

$$f = \sum_{\alpha} (f, \varphi_\alpha)\, \varphi_\alpha, \qquad \|f\|^2 = \sum_{\alpha} (f, \varphi_\alpha)^2,$$

where neither sum contains more than countably many nonzero terms.


Problem 8. Give an example of a Euclidean space R and an orthonormal
system $\{\varphi_n\}$ in R such that R contains no nonzero element orthogonal to every
$\varphi_n$, even though $\{\varphi_n\}$ fails to be complete.

Comment. By Theorem 10, R cannot be complete.


Problem 9. Given a Euclidean space R, not necessarily complete, let R*
be the completion of R as defined in Sec. 7.4. Define linear operations and
the scalar product in R* by “continuous extension” of those in $R \subset R^*$.
More exactly, if $x_n \to x$, $y_n \to y$ where $x_n, y_n \in R$, let

$$x + y = \lim_{n \to \infty} (x_n + y_n), \qquad \alpha x = \lim_{n \to \infty} \alpha x_n, \qquad (x, y) = \lim_{n \to \infty} (x_n, y_n).$$

Prove that


a) These limits exist and are independent of the choice of the sequences
$\{x_n\}$, $\{y_n\}$ in R converging to x and y;
b) R* is itself a Euclidean space.


Complete $C^2_{[a,b]}$ in this way, and show that the resulting space is a Hilbert
space.


Comment. The elements belonging to the completion of $C^2_{[a,b]}$ but not to
$C^2_{[a,b]}$ are themselves functions, in fact discontinuous functions whose squares
are Lebesgue-integrable on [a, b], as defined in Sec. 29.


Problem 10. Prove that each of the following sets is a subspace of the
Hilbert space $l_2$:


a) The set of all $(x_1, x_2, \ldots, x_k, \ldots) \in l_2$ such that $x_1 = x_2$;
b) The set of all $(x_1, x_2, \ldots, x_k, \ldots) \in l_2$ such that $x_k = 0$ for all even k.


Problem 11. Show that every complex Euclidean space of finite dimension
n is isomorphic to the space $C^n$ of Example 1, p. 164. Generalize Problem 9
to the case where $C^2_{[a,b]}$ is the complex space of Example 3, p. 164.


17. Topological Linear Spaces 


17.1. Definitions and examples. Specification of a norm is only one way 
of introducing a topology into a linear space. There are many situations in 
analysis, notably in the theory of generalized functions (to be discussed 
in Sec. 21), where it is desirable to use other methods of equipping a linear 
space with a topology: 


DEFINITION 1. By a topological linear space is meant a set E with the 
following properties: 


1) E is a linear space; 

2) E is a topological space; 

3) The operations of addition of elements of E and multiplication of
elements of E by numbers (real or complex) are continuous with
respect to the topology in E, in the sense that

a) If $z_0 = x_0 + y_0$, then, given any neighborhood U of the point $z_0$,
there are neighborhoods V and W of the points $x_0$ and $y_0$,
respectively, such that $x + y \in U$ whenever $x \in V$, $y \in W$;

b) If $\alpha_0 x_0 = y_0$, then, given any neighborhood U of the point $y_0$,
there is a neighborhood V of the point $x_0$ and a number $\varepsilon > 0$
such that $\alpha x \in U$ whenever $x \in V$, $|\alpha - \alpha_0| < \varepsilon$.


THEOREM 1. Let E be a topological linear space, and let U be any
neighborhood of zero. Then the set

$$U + x_0 = \{y : y = x + x_0,\ x \in U\}$$

is a neighborhood of $x_0$. Moreover, every neighborhood of $x_0$ is a set of this
form, i.e., some neighborhood of zero “shifted by the vector $x_0$.”


Proof. It follows from property 3a) that the mapping $f(x) = x - x_0$
carrying E into itself is continuous. Hence, by Theorem 10, p. 87, the
preimage $f^{-1}(U)$ of any neighborhood U of the point zero is itself a
neighborhood. But $f^{-1}(U) = U + x_0$. Therefore $U + x_0$ is a neighborhood,
obviously of the point $x_0$. Similarly, given any neighborhood V
of the point $x_0$, let $U = V - x_0 = V + (-x_0)$. Then U is a neighborhood
of zero, by the continuity of the mapping $g(x) = x + x_0$. But
clearly $U + x_0 = V$. $\blacksquare$


Remark. Thus the topology in E is determined by giving a neighborhood
base at zero, i.e., a system $\mathcal{N}$ of neighborhoods of zero with the property
that, given any open set $G \subset E$ containing the point zero, there is a neighborhood
$N \in \mathcal{N}$ contained in G. In fact, the mapping $f(x) = x + x_0$ carries a
neighborhood base at zero into a neighborhood base at $x_0$. Hence $\mathcal{N}$
and its “translates,” i.e., the system of all sets of the form $\{V : V = U + x,\ U \in \mathcal{N},\ x \in E\}$,
constitute a base for the topology in E. In this sense, $\mathcal{N}$
“generates” the topology in E.


Example 1. Every normed linear space is clearly a topological linear
space. In fact, it is an immediate consequence of the properties of a norm 
that the operations of addition of vectors and multiplication of vectors by 
scalars are continuous with respect to the topology “induced” by the norm. 


Example 2. Let $R^\infty$ be the linear space of all numerical sequences
$x = (x_1, \ldots, x_k, \ldots)$, real or complex, and let $\mathcal{N}$ consist of all sets of the form

$$U_{k_1, \ldots, k_r;\, \varepsilon} = \{x : x \in R^\infty,\ |x_{k_1}| < \varepsilon, \ldots, |x_{k_r}| < \varepsilon\}$$

for some number $\varepsilon > 0$ and positive integers $k_1, \ldots, k_r$. Then $R^\infty$ becomes
a topological linear space when equipped with the topology generated by
$\mathcal{N}$.¹⁹


19 As an exercise, verify that $\mathcal{N}$ and its translates satisfy Theorem 2 (or Theorem 3)
of Sec. 9.3 and that the linear operations in $R^\infty$ are continuous with respect to the topology
generated by $\mathcal{N}$.

Example 3. Let $K_{[a,b]}$ be the linear space of all infinitely differentiable
functions on the interval [a, b],²⁰ and let $\mathcal{N}$ consist of all sets of the form

$$U_{r, \varepsilon} = \{\varphi : \varphi \in K_{[a,b]},\ |\varphi^{(0)}(x)| < \varepsilon, \ldots, |\varphi^{(r)}(x)| < \varepsilon \text{ for all } x \in [a, b]\}$$

for some number $\varepsilon > 0$ and positive integer r. Then $K_{[a,b]}$ becomes a topological
linear space when equipped with the topology generated by this
neighborhood base (again supply some missing details).


DEFINITION 2. A subset M of a topological linear space E is said to be
bounded if, given any neighborhood U of zero, there is a number $\alpha > 0$ such
that $M \subset \alpha U = \{z : z = \alpha x,\ x \in U\}$.


DEFINITION 3. A topological linear space E is said to be locally bounded 
if it contains at least one nonempty bounded open set. 


THEOREM 2. Every normed linear space E is locally bounded. 


Proof. Given any $\varepsilon > 0$, the set of all $x \in E$ such that $\|x\| < \varepsilon$ is
obviously nonempty, bounded and open. $\blacksquare$


DEFINITION 4. A topological linear space E is said to be locally convex 
if every nonempty open set in E contains a nonempty convex open subset. 


THEOREM 3. Every normed linear space E is locally convex. 


Proof. Merely note that every nonempty open set in E contains an
open sphere. $\blacksquare$


Remark. It follows from Theorems 2 and 3 that every normed linear space
is both locally bounded and locally convex. Conversely, it can be shown that
every locally bounded and locally convex topological linear space satisfying
the first axiom of separation is normable, in the sense that E can be equipped
with a norm $\|\cdot\|$ generating the given topology in E, via the metric
$\rho(x, y) = \|x - y\|$.


17.2. Historical remarks. For some time it was thought that the concept
of a normed linear space (introduced in the thirties, notably in the work of
Banach) was general enough to serve all the concrete needs of analysis.
However, it subsequently became apparent that this was not so and that
there are a number of problems involving such spaces as the space of infinitely
differentiable functions, the space $R^\infty$ of all numerical sequences,
etc., in which the natural topology cannot be specified in terms of any norm
whatsoever. Thus topological linear spaces, as opposed to normed linear
spaces, are by no means “exotic” or “pathological.” On the contrary, some
of these spaces are no less natural and important a generalization of finite-
dimensional Euclidean space than, say, Hilbert space.


20 A function $\varphi$ is said to be infinitely differentiable if it has derivatives $\varphi^{(k)}$ of all orders
k = 0, 1, 2, ... (the zeroth derivative $\varphi^{(0)}$ is just the function $\varphi$ itself).

21 A sequence $\{x_n\}$ of points in E is said to be bounded if the set $\{x_1, x_2, \ldots, x_n, \ldots\}$,
consisting of all terms of the sequence, is bounded.


Problem 1. Reconcile Definition 2 with Problem 1, p. 141 in the case
where E is a normed linear space.


Problem 2. Let E be a topological linear space. Prove that


a) If U and V are open sets, then so is $U + V = \{z : z = x + y,\ x \in U,\ y \in V\}$;

b) If U is open, then so is $\alpha U = \{z : z = \alpha x,\ x \in U\}$ provided that $\alpha \neq 0$;

c) If $F \subset E$ is closed, then so is $\alpha F$ for arbitrary $\alpha$.


Problem 3. Prove that a topological linear space is a $T_1$-space if and only
if the intersection of all neighborhoods of zero contains no nonzero elements.


Problem 4. Prove that a topological linear space E automatically has the
following separation property: Given any point $x \in E$ and any neighborhood
U of x, there is another neighborhood V of x such that $[V] \subset U$.


Hint. If U is a neighborhood of zero, then, by the continuity of subtraction,
there is a neighborhood V of zero such that²²

$$V - V \equiv \{z : z = x - y,\ x \in V,\ y \in V\} \subset U.$$

Suppose $y \in [V]$. Then every neighborhood of y, in particular $V + y$,
contains a point of V. Hence there is a point $z \in V$ such that $z + y \in V$. It
follows that $y \in V - V \subset U$.


Problem 5. Prove that a topological space T has the separation property
figuring in Problem 4 if and only if for each point $x \in T$ and each closed set
$F \subset T$ not containing x, there is an open set $O_1$ containing x and an open set
$O_2$ containing F such that $O_1 \cap O_2 = \emptyset$.


Comment. Thus, for $T_1$-spaces, this separation property is “halfway
between” that of a Hausdorff space and that of a normal space.


Problem 6. Given a topological linear space E, prove that


a) If $\{x_n\}$ is a convergent sequence of points in E, then the set
$M = \{x_1, x_2, \ldots, x_n, \ldots\}$ is bounded;

b) A subset $M \subset E$ is bounded if and only if, given any sequence $\{x_n\}$
of points in M and any sequence $\{\varepsilon_n\}$ of positive numbers converging
to zero, the sequence $\{\varepsilon_n x_n\}$ also converges to zero.


22 Here the minus sign in $V - V$ does not have the usual meaning of a set difference
(the same kind of notation was used in Sec. 14.5).

Problem 7. Prove that


a) The space $R^\infty$ of Example 2, p. 168 is not locally bounded;
b) Every locally bounded topological linear space satisfies the first axiom
of countability.


Problem 8. Let x be any point of a locally convex topological linear 
space E, and let U be any neighborhood of x. Prove that x has a convex 
neighborhood contained in U. 


Hint. It is enough to consider the case x = 0. Suppose U is a neighborhood
of zero. Then there is a neighborhood V of zero such that $V - V \subset U$,
where $V - V$ is the same as in the hint to Problem 4. Since E is locally
convex, there is a nonempty convex open set $V' \subset V$. If $x_0 \in V'$, then $V' - x_0$
is a convex neighborhood of zero contained in U.


Problem 9. Prove that an open set U in a topological linear space is 
convex if and only if U + U = 2U. 


Problem 10. Given a linear space E, a set $U \subset E$ is said to be symmetric
if $x \in U$ implies $-x \in U$. Let $\mathcal{B}$ be the set of all convex symmetric subsets
of E such that each coincides with its own interior. Prove that


a) $\mathcal{B}$ is a system of neighborhoods of zero determining a locally convex
topology $\tau$ in E which satisfies the first axiom of separation;

b) The topology $\tau$ is the strongest locally convex topology compatible
with the linear operations in E;

c) Every linear functional on E is continuous with respect to $\tau$.


Problem 11. Two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ in a linear space E are said to be
compatible if, whenever a sequence $\{x_n\}$ in E is fundamental with respect
to both norms and converges to a limit $x \in E$ with respect to one of them, it
also converges to the same limit x with respect to the other norm. A linear
space E equipped with a countable system of compatible norms $\|\cdot\|_n$ is said
to be countably normed. Prove that every countably normed linear space
becomes a topological linear space when equipped with the topology
generated by the neighborhood base consisting of all sets of the form

$$U_{r, \varepsilon} = \{x : x \in E,\ \|x\|_1 < \varepsilon, \ldots, \|x\|_r < \varepsilon\} \tag{1}$$

for some number $\varepsilon > 0$ and positive integer r.


Problem 12. Prove that each of the following spaces is countably normed,
i.e., in each case verify the compatibility of the given system of norms $\|\cdot\|_n$:


a) The space $K_{[a,b]}$ of infinitely differentiable functions on $[a, b]$, equipped
with the norms

$$\|f\|_n = \sup_{\substack{a \le t \le b \\ 0 \le k \le n}} |f^{(k)}(t)| \qquad (n = 0, 1, 2, \ldots); \tag{2}$$

b) The space $S_\infty$ of all infinitely differentiable functions $f(t)$ on $(-\infty, \infty)$
such that $f(t)$ and all its derivatives approach zero as $|t| \to \infty$ faster
than any power of $1/|t|$ (i.e., such that $t^p f^{(q)}(t) \to 0$ as $|t| \to \infty$ for
arbitrary $p$ and $q$), equipped with the norms

$$\|f\|_n = \sup_{\substack{-\infty < t < \infty \\ p,\, q \le n}} |t^p f^{(q)}(t)| \qquad (n = 0, 1, 2, \ldots);$$

c) The space $\Phi$ of all numerical sequences $x = (x_1, \ldots, x_k, \ldots)$ such
that

$$\sum_{k=1}^{\infty} k^n x_k^2$$

converges for all $n = 0, 1, 2, \ldots$, equipped with the norms

$$\|x\|_n = \sqrt{\sum_{k=1}^{\infty} k^n x_k^2} \qquad (n = 0, 1, 2, \ldots).$$


Show that (1) and (2) define the same topology in $K_{[a,b]}$ as in Example 3,
p. 169.


Comment. $\Phi$ might be called the space of “rapidly decreasing sequences.”


Problem 13. A norm $\|\cdot\|_1$ is said to be stronger than a norm $\|\cdot\|_2$ if there is
a constant $c > 0$ such that $\|x\|_1 \ge c\,\|x\|_2$ for all $x \in E$ (then $\|\cdot\|_2$ is said to
be weaker than $\|\cdot\|_1$). Discuss the norms (2) in this language.


Comment. Two norms are said to be comparable if one is stronger than 
the other, and equivalent if one is both stronger and weaker than the other 
(cf. Problem 7, p. 141). 


Problem 14. Prove that every countably normed space satisfies the first 
axiom of countability. 


Hint. Replace the system of neighborhoods $U_{r,\varepsilon}$ by the subsystem such
that $\varepsilon$ takes only the values

$$1, \tfrac{1}{2}, \ldots, \tfrac{1}{n}, \ldots$$

(this can be done without changing the topology).


Comment. Thus the topology in $E$ can be described in terms of convergent
sequences (recall Sec. 9.4).


Problem 15. Prove that the topology in a countably normed space can be
specified in terms of the metric

$$\rho(x, y) = \sum_{n=1}^{\infty} \frac{1}{2^n}\, \frac{\|x - y\|_n}{1 + \|x - y\|_n} \qquad (x, y \in E). \tag{3}$$

First verify that $\rho(x, y)$ has all the properties of a metric, and is invariant
under shifts in the sense that $\rho(x + z, y + z) = \rho(x, y)$ for all $x, y, z \in E$.


Comment. A countably normed space is said to be complete if it is 
complete with respect to the metric (3). 
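
The metric (3) can be explored numerically. The following is only a sketch under stated assumptions: it uses the norms of Problem 12c on finitely supported sequences, sums just the first 30 terms of the series (3), and the names `phi_norm` and `rho` are hypothetical helpers introduced here for illustration.

```python
import math

def phi_norm(x, n):
    """n-th norm of Problem 12c: sqrt(sum_k k**n * x_k**2), with k starting at 1."""
    return math.sqrt(sum((k ** n) * xk * xk for k, xk in enumerate(x, start=1)))

def rho(x, y, terms=30):
    """Partial sum of the metric (3) over the norms n = 1, ..., terms (an assumption)."""
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for n in range(1, terms + 1):
        d = phi_norm(diff, n)
        total += d / (1.0 + d) / 2 ** n
    return total

# Three finitely supported sequences (hypothetical sample data).
x = [1.0, 0.5, 0.25]
y = [0.0, 0.5, -0.25]
z = [3.0, -1.0, 2.0]

# Distance between the shifted points x + z and y + z.
shifted = rho([a + c for a, c in zip(x, z)], [b + c for b, c in zip(y, z)])
```

In this sample, `shifted` agrees with `rho(x, y)`, as the shift invariance requires, and every value of `rho` stays below 1 because each term of (3) is less than $2^{-n}$.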


Problem 16. Prove that a sequence $\{x_k\}$ in a countably normed space is
fundamental with respect to the metric (3) if and only if it is fundamental
with respect to each of the norms $\|\cdot\|_n$. Prove that $\{x_k\}$ converges to an
element $x \in E$ with respect to the metric (3) if and only if it converges to
$x$ with respect to each of the norms $\|\cdot\|_n$.


Comment. Thus, in particular, a countably normed space $E$ is said to be
complete if a sequence $\{x_k\}$ in $E$ converges whenever it is fundamental with
respect to each of the norms $\|\cdot\|_n$.


Problem 17. An infinite-dimensional separable linear space $H$ equipped
with a countable system of scalar products $(\cdot,\cdot)_n$ is said to be countably
Hilbert if the norms

$$\|x\|_n = \sqrt{(x, x)_n} \qquad (x \in H)$$

generated by these scalar products are compatible and if the space $H$ is
complete. Prove that the space $\Phi$ of Problem 12c is countably Hilbert when
equipped with the scalar products

$$(x, y)_n = \sum_{k=1}^{\infty} k^n x_k y_k \qquad (n = 0, 1, 2, \ldots),$$

where $x = (x_1, \ldots, x_k, \ldots)$, $y = (y_1, \ldots, y_k, \ldots)$ are any two elements of $\Phi$.


Problem 18. The norms $\|\cdot\|_n$ in a countably normed space $E$ can be
assumed to satisfy the condition

$$\|x\|_k \le \|x\|_l \quad \text{if } k < l, \tag{4}$$

since otherwise we can replace $\|\cdot\|_n$ by

$$\|x\|'_n = \sup\,\{\|x\|_1, \|x\|_2, \ldots, \|x\|_n\}.$$

(Prove that this does not change the topology in $E$.) Let $E_n$ denote the
completion of $E$ with respect to the norm $\|\cdot\|_n$. Using (4), prove that

$$E_1 \supset E_2 \supset \cdots \supset E_n \supset \cdots.$$

Clearly,

$$E \subset \bigcap_{n=1}^{\infty} E_n.$$

Prove that $E$ is complete if and only if

$$E = \bigcap_{n=1}^{\infty} E_n.$$
Problem 19. Let $C^{(n)}_{[a,b]}$ be the space of all functions defined on the interval
$[a, b]$ with continuous derivatives up to order $n$ inclusive, equipped with the
norm

$$\|f\|_n = \sup_{\substack{a \le t \le b \\ 0 \le k \le n}} |f^{(k)}(t)|$$

(note that $C^{(0)}_{[a,b]} = C_{[a,b]}$). Prove that $C^{(n)}_{[a,b]}$ is complete. Prove that $K_{[a,b]}$
equals the intersection

$$\bigcap_{n=0}^{\infty} C^{(n)}_{[a,b]},$$

and hence is complete (by Problem 18).


5


LINEAR FUNCTIONALS 


18. Continuous Linear Functionals 


18.1. Continuous linear functionals on a topological linear space. A (real)
functional $f$ defined on a topological linear space $E$ is said to be linear on $E$ if

$$f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)$$

for all $x, y \in E$ and arbitrary numbers $\alpha, \beta$ (recall Sec. 13.5), and continuous
at the point $x_0 \in E$ if, given any $\varepsilon > 0$, there is a neighborhood $U$ of $x_0$ such
that

$$|f(x) - f(x_0)| < \varepsilon \tag{1}$$

for all $x \in U$ (recall Sec. 9.6). We say that the functional $f$ is continuous (on
$E$) if it is continuous at every point $x_0 \in E$.


THEOREM 1. Let $f$ be a linear functional on a topological linear space $E$,
and suppose $f$ is continuous at some point $x_0 \in E$. Then $f$ is continuous on
$E$, i.e., at every point of $E$.


Proof. Given any point $y \in E$ and any number $\varepsilon > 0$, let $U$ be a
neighborhood of $x_0$ such that $x \in U$ implies (1). Then

$$V = U + (y - x_0) = \{z : z = x + y - x_0,\ x \in U\}$$

is a neighborhood of $y$, by Theorem 1, p. 168. Moreover, $x \in V$ implies
$x + x_0 - y \in U$ and hence

$$|f(x) - f(y)| = |f(x + x_0 - y) - f(x_0)| < \varepsilon,$$

i.e., $f$ is continuous at $y$. ∎
COROLLARY. The continuity of a linear functional on a topological linear
space need only be checked at a single point, for example, at the point zero.


THEOREM 2. Let $f$ be a linear functional on a topological linear space $E$.
Then $f$ is continuous on $E$ if and only if $f$ is bounded in some neighborhood
of zero.¹


Proof. Suppose $f$ is continuous on $E$, in particular at the point zero.
Then, given any $\varepsilon > 0$, there is a neighborhood of zero in which
$|f(x)| < \varepsilon$. Obviously, $f$ is bounded in this neighborhood.

Conversely, suppose $f$ is bounded in some neighborhood $U$ of zero,
so that $|f(x)| < C$ for all $x \in U$, where $C$ is a suitable constant. Then,
given any $\varepsilon > 0$, we have $|f(x)| < \varepsilon$ for all $x$ in the neighborhood

$$\frac{\varepsilon}{C}\,U = \left\{z : z = \frac{\varepsilon}{C}\,x,\ x \in U\right\},$$

i.e., $f$ is continuous at zero and hence on all of $E$. ∎


THEOREM 3. A necessary condition for a linear functional $f$ to be
continuous on a topological linear space $E$ is that $f$ be bounded on every
bounded set. The condition is also sufficient if $E$ satisfies the first axiom of
countability.


Proof. To prove the necessity, suppose $f$ is continuous on $E$. Then $f$
is bounded in some neighborhood $U$ of zero:

$$|f(x)| \le C \qquad (x \in U).$$

Let $M \subset E$ be any bounded set, as defined in Definition 2, p. 169. Then
$M \subset \alpha U$ for some $\alpha > 0$, and hence

$$|f(x)| \le C\alpha \qquad (x \in M),$$

i.e., $f$ is bounded on $M$.
As for the sufficiency, let $\{U_n\}$ be a countable neighborhood base at
the point zero such that

$$U_1 \supset U_2 \supset \cdots \supset U_n \supset \cdots$$

(cf. the proof of Theorem 7, p. 84). If $f$ fails to be continuous on $E$, it
cannot be bounded on any of these neighborhoods of zero. Therefore in
each $U_n$ there is a point $x_n$ such that $|f(x_n)| > n$. The sequence $\{x_n\}$ is
bounded (recall footnote 21, p. 169), and even converges to zero, while
the sequence $\{f(x_n)\}$ is unbounded. But then $f$ fails to be bounded on
the bounded set $\{x_1, x_2, \ldots, x_n, \ldots\}$, contrary to hypothesis. ∎


Guided by Theorem 3, we introduce 





¹ Recall footnote 14, p. 110.
DEFINITION 1. Given a linear functional $f$ on a topological linear space
$E$, suppose $f$ is bounded on every bounded subset of $E$. Then $f$ is said to be
a bounded linear functional.


Remark. In general, a bounded linear functional need not be continuous.


18.2. Continuous linear functionals on a normed linear space. Suppose
$E$ is a normed linear space, so that in particular $E$ satisfies the first axiom of
countability (recall the remark on p. 83). Then, by Theorem 3, a linear
functional on $E$ is continuous if and only if it is bounded. But by a bounded
set in a normed linear space we mean a set contained in some closed sphere
$\|x\| \le C$ (recall Problem 1, p. 141). Therefore a linear functional $f$ on a
normed linear space is bounded (and hence continuous) if and only if it is
bounded on every closed sphere $\|x\| \le C$, or equivalently on the closed unit
sphere $\|x\| \le 1$, because of the linearity of $f$. In other words, $f$ is bounded
if and only if the number

$$\|f\| = \sup_{\|x\| \le 1} |f(x)| \tag{2}$$

is finite.


DEFINITION 2. Given a bounded linear functional $f$ on a normed linear
space $E$, the number (2), equal to the least upper bound of $|f(x)|$ on the
closed unit sphere $\|x\| \le 1$, is called the norm of $f$.


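THEOREM 4. The norm $\|f\|$ has the following two properties:

$$\|f\| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|}, \tag{3}$$

$$|f(x)| \le \|f\|\,\|x\| \quad \text{for all } x \in E. \tag{4}$$


Proof. Clearly,

$$\|f\| = \sup_{\|x\| \le 1} |f(x)| = \sup_{\|x\| = 1} |f(x)|$$

(why?). But the set of all vectors in $E$ of norm 1 coincides with the set
of all vectors

$$\frac{x}{\|x\|} \qquad (x \in E,\ x \ne 0), \tag{5}$$

and hence

$$\|f\| = \sup_{\|x\| = 1} |f(x)| = \sup_{x \ne 0} \left| f\!\left(\frac{x}{\|x\|}\right) \right| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|},$$

which proves (3). Moreover, since the vectors (5) all have norm 1, it
follows from (2) that

$$\left| f\!\left(\frac{x}{\|x\|}\right) \right| = \frac{|f(x)|}{\|x\|} \le \|f\| \qquad (x \in E,\ x \ne 0),$$

which implies (4) for $x \ne 0$. The validity of (4) for $x = 0$ is obvious. ∎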










Example 1. Let $R^n$ be Euclidean $n$-space, and let $a$ be any fixed nonzero
vector in $R^n$. Then the scalar product

$$f(x) = (x, a) \qquad (x \in R^n)$$

defines a functional on $R^n$ which is obviously linear. By Schwarz's inequality,

$$|f(x)| = |(x, a)| \le \|x\|\,\|a\|. \tag{6}$$

Therefore $f$ is bounded and hence continuous on $R^n$. It follows from (6) that

$$\frac{|f(x)|}{\|x\|} \le \|a\| \qquad (x \ne 0). \tag{7}$$

The right-hand side of (7) is independent of $x$, and hence

$$\sup_{x \ne 0} \frac{|f(x)|}{\|x\|} \le \|a\|,$$

i.e.,

$$\|f\| \le \|a\|.$$

But choosing $x = a$, we get

$$|f(a)| = |(a, a)| = \|a\|^2,$$

or equivalently

$$\frac{|f(a)|}{\|a\|} = \|a\|.$$

It follows from (3) that

$$\|f\| = \|a\|.$$

Example 2. More generally, let $R$ be an arbitrary Euclidean space, and
let $a$ be a fixed element of $R$. Then the same argument as in the preceding
example shows that the scalar product

$$f(x) = (x, a) \qquad (x \in R)$$

defines a bounded linear functional on $R$, with norm

$$\|f\| = \|a\|.$$
Example 3. The integral

$$I(x) = \int_a^b x(t)\,dt$$

is a linear functional on the space $C_{[a,b]}$. Since

$$|I(x)| = \left| \int_a^b x(t)\,dt \right| \le \max_{a \le t \le b} |x(t)|\,(b - a) = \|x\|\,(b - a),$$

where the equality holds if $x(t) = \text{const}$, we see that the functional $I$ is
bounded, with norm

$$\|I\| = b - a. \tag{8}$$
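Example 4. More generally, let $y_0(t)$ be a fixed function in $C_{[a,b]}$, and let

$$I(x) = \int_a^b x(t)\,y_0(t)\,dt.$$

Then $I$ is a linear functional on $C_{[a,b]}$. Since

$$|I(x)| = \left| \int_a^b x(t)\,y_0(t)\,dt \right| \le \|x\| \int_a^b |y_0(t)|\,dt,$$

where the equality holds for $x(t) = \text{const}$ if $y_0(t)$ does not change sign (in
general the bound is approached by continuous functions $x(t)$ with $\|x\| \le 1$
approximating $\operatorname{sign} y_0(t)$), the functional $I$ is bounded, with
norm

$$\|I\| = \int_a^b |y_0(t)|\,dt. \tag{9}$$

Note that (9) reduces to (8) in the case $y_0(t) \equiv 1$.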
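
The norm formula (9) can be probed numerically when $y_0$ changes sign. This is a rough sketch under stated assumptions: the interval $[0, 1]$, the function $y_0(t) = t - 1/2$, the midpoint rule, and the steep ramp approximating $\operatorname{sign} y_0$ are all hypothetical choices made for illustration.

```python
def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

a, b = 0.0, 1.0

def y0(t):
    """Hypothetical fixed function that changes sign at t = 1/2."""
    return t - 0.5

def I(x):
    """The functional I(x): integral of x(t) * y0(t) over [a, b]."""
    return integrate(lambda t: x(t) * y0(t), a, b)

def const(t):
    """Constant function of norm 1."""
    return 1.0

def ramp(t):
    """Continuous function of norm 1 that approximates sign(y0)."""
    return max(-1.0, min(1.0, 100.0 * y0(t)))

bound = integrate(lambda t: abs(y0(t)), a, b)   # integral of |y0|, here 1/4
I_const = I(const)                              # close to 0
I_ramp = I(ramp)                                # close to the bound 1/4
```

Here the constant function gives a value near zero, far from the bound, while the ramp comes within about $10^{-4}$ of it; steeper ramps come closer still, which is why the supremum in (9) is the integral of $|y_0|$.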
Example 5. As in Example 3, p. 124, let

$$\delta_{t_0}(x) = x(t_0)$$

be the linear functional on $C_{[a,b]}$ which assigns to each function $x(t) \in C_{[a,b]}$
its value at some fixed point $t_0 \in [a, b]$. Clearly

$$|x(t_0)| \le \max_{a \le t \le b} |x(t)| = \|x\|,$$

where equality holds if $x(t) = \text{const}$. Hence $\delta_{t_0}$ is bounded, with norm

$$\|\delta_{t_0}\| = 1.$$


The concept of the norm of a bounded linear functional on a normed
linear space can be given a simple geometric interpretation. As shown in
Theorem 4, p. 127, every nontrivial linear functional $f$ can be associated
with a hyperplane

$$M_f = \{x : f(x) = 1\}.$$

Let $d$ be the distance from the hyperplane $M_f$ to the point $x = 0$, defined as

$$d = \inf_{f(x) = 1} \|x\|$$

(cf. Problem 9, p. 54). Since, as always,

$$|f(x)| \le \|f\|\,\|x\|,$$

$f(x) = 1$ implies

$$\|x\| \ge \frac{1}{\|f\|} \qquad (x \in M_f),$$

and hence

$$d \ge \frac{1}{\|f\|}. \tag{10}$$

On the other hand, it follows from (3) that, given any $\varepsilon > 0$, there is an
element $x_\varepsilon$ such that $f(x_\varepsilon) = 1$ and

$$(\|f\| - \varepsilon)\,\|x_\varepsilon\| < 1.$$

Therefore

$$d = \inf_{f(x) = 1} \|x\| < \frac{1}{\|f\| - \varepsilon},$$

and hence

$$d \le \frac{1}{\|f\|}, \tag{11}$$

since $\varepsilon > 0$ is arbitrary. Comparing (10) and (11), we get

$$d = \frac{1}{\|f\|},$$

i.e., the norm of the linear functional $f$ equals the reciprocal of the distance
between the hyperplane $f(x) = 1$ and the point $x = 0$.
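
The relation between the norm and the distance to the hyperplane $f(x) = 1$ can be illustrated in Euclidean 3-space. This is a minimal sketch, assuming the functional $f(x) = (x, a)$ of Example 1 with a hypothetical vector `a`; the candidate nearest point $a / \|a\|^2$ is a standard fact about Euclidean space, not something taken from the text.

```python
import math
import random

def dot(x, y):
    """Scalar product in Euclidean n-space."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm."""
    return math.sqrt(dot(x, x))

a = [3.0, -4.0, 12.0]                     # hypothetical vector, so the norm of f is 13

# Point of the hyperplane (x, a) = 1 lying along the direction of a.
nearest = [ai / dot(a, a) for ai in a]
d = norm(nearest)                         # candidate distance, equal to 1/13

# Random points of the hyperplane, obtained by rescaling random vectors.
random.seed(1)
sample_norms = []
for _ in range(1000):
    v = [random.uniform(-1.0, 1.0) for _ in a]
    fv = dot(v, a)
    if abs(fv) > 1e-9:
        x = [vi / fv for vi in v]         # now (x, a) = 1
        sample_norms.append(norm(x))
```

No sampled point of the hyperplane is closer to the origin than `d`, in agreement with the inequality (10) and the value of the distance found above.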


18.3. The Hahn-Banach theorem for a normed linear space. Let $f_0(x)$ be a
linear functional defined on a subspace $L$ of a linear space $E$, satisfying the
condition

$$|f_0(x)| \le p(x), \tag{12}$$

where $p$ is a finite convex functional on $E$. Then, according to the Hahn-Banach
theorem (Theorem 5, p. 132), $f_0$ can be extended onto the whole
space $E$ without violating the condition (12). As applied to bounded linear
functionals on a normed linear space $E$, this result can be formulated as
follows:


THEOREM 5 (Hahn-Banach). Given a real normed linear space $E$, let
$L$ be a subspace of $E$ and $f_0$ a bounded linear functional on $L$. Then $f_0$ can
be extended to a bounded linear functional $f$ on the whole space $E$ without
increasing its norm, i.e.,

$$\|f\|_{\text{on } E} = \|f_0\|_{\text{on } L}.$$


Proof. We need only choose the functional $p$ in Theorem 5, p. 132 to
be the convex functional $k\,\|x\|$, where

$$k = \|f_0\|_{\text{on } L}. \quad ∎$$

This form of the Hahn-Banach theorem has a simple geometric interpretation.
The equation

$$f_0(x) = 1 \tag{13}$$

specifies a hyperplane in the subspace $L$, at distance

$$\frac{1}{\|f_0\|}$$

from the origin (the point $x = 0$). The fact that the functional $f_0$ can be
extended onto the whole space $E$ without increasing its norm means that the
hyperplane (13) can be extended to a larger hyperplane in the whole space
$E$ in such a way that the distance between the larger hyperplane and the
origin is the same as the distance between the hyperplane (13) and the origin.

In the same way, starting from the complex version of the Hahn-Banach 
theorem (Theorem 5’, p. 134), we get 


THEOREM 5′. Given a complex normed linear space $E$, let $L$ be a
subspace of $E$ and $f_0$ a bounded linear functional on $L$. Then $f_0$ can be
extended to a bounded linear functional $f$ on the whole space $E$ without
increasing its norm, i.e.,

$$\|f\|_{\text{on } E} = \|f_0\|_{\text{on } L}.$$


In the case of an arbitrary topological linear space $E$, a nontrivial continuous
linear functional on $E$ may not even exist. However, by imposing
suitable restrictions on $E$, we can guarantee the existence of “sufficiently
many” continuous linear functionals on $E$.²


DEFINITION 3. A topological linear space $E$ is said to have sufficiently
many continuous linear functionals if for each pair of distinct points
$x_1, x_2 \in E$ there exists a continuous linear functional $f$ on $E$ such that
$f(x_1) \ne f(x_2)$, or equivalently, if for each nonzero element $x_0 \in E$ there
exists a continuous linear functional $f$ on $E$ such that $f(x_0) \ne 0$.


THEOREM 6. Every normed linear space $E$ has sufficiently many continuous
linear functionals.


Proof. Given any nonzero element $x_0 \in E$, we define a linear
functional

$$f_0(\lambda x_0) = \lambda$$

on the set $L$ of all elements of the form $\lambda x_0$. We then use the Hahn-Banach
theorem to extend $f_0$ onto the whole space $E$. This gives a
continuous linear functional $f$ on $E$ such that $f(x_0) = 1 \ne 0$. ∎


Problem 1. Prove that a functional $f$ on a $T_1$-space $E$ is continuous at a
point $x \in E$ if and only if $x_n \to x$ implies $f(x_n) \to f(x)$.


Problem 2. Prove that every linear functional on a finite-dimensional
topological linear space is automatically continuous.


Problem 3. Let $E$ be a topological linear space. Prove that a linear
functional $f$ on $E$ is continuous if and only if


a) Its null space $\{x : f(x) = 0\}$ is closed in $E$;
b) There exists an open set $U \subset E$ and a number $t$ such that $t \notin f(U)$.


² See Theorem 6 and Problems 7-8.
Problem 4. Given a topological linear space $E$, prove that


a) If every linear functional on $E$ is continuous, then the topology in
$E$ is the topology $\tau$ of Problem 10, p. 171;

b) If $E$ is infinite-dimensional and normable, then there exists a noncontinuous
linear functional on $E$;

c) If $E$ has a neighborhood base at zero whose power does not exceed
the algebraic dimension of $E$, then there exists a noncontinuous linear
functional on $E$.


Hint. In b) use the existence of a Hamel basis in $E$ (recall Problem 4,
p. 128, where algebraic dimension is also defined).


Problem 5. Prove that 
f(x) = ax(0) + bx(1), 


a(x) = Px at — J x(n dt 


are both bounded linear functionals on the space $C_{[0,1]}$. What are their
norms?


Problem 6. As in Problem 11, p. 171, let $E$ be a countably normed space
with norms $\|\cdot\|_n$, where

$$\|x\|_1 \le \|x\|_2 \le \cdots \le \|x\|_n \le \cdots \tag{14}$$

(as in Problem 18, p. 173, this condition entails no loss of generality).
Let $E^*$ be the set of all continuous linear functionals on $E$, and let $E_n^*$ be
the set of all linear functionals on $E$ which are continuous with respect to
the norm $\|\cdot\|_n$. Prove that

$$E_1^* \subset E_2^* \subset \cdots \subset E_n^* \subset \cdots$$

and

$$E^* = \bigcup_{n=1}^{\infty} E_n^*. \tag{15}$$


Hint. If $f$ is a continuous linear functional on $E$, then, by Theorem 2,
there is a neighborhood $U$ of zero in which $f$ is bounded. It follows from
(14) and the definition of the topology in $E$ that there is a number $\varepsilon > 0$ and
a positive integer $k$ such that the open sphere $\|x\|_k < \varepsilon$ is contained in $U$.
Being bounded on this sphere, $f$ is bounded and continuous with respect to
the norm $\|\cdot\|_k$.


Comment. Let $f$ be a continuous linear functional on $E$, i.e., let $f \in E^*$.
Then by the order of $f$ is meant the smallest integer $n$ for which $f \in E_n^*$. It
follows from (15) that every continuous linear functional on $E$ is of finite
order.
Problem 7. Prove that every countably normed space $E$ has sufficiently
many continuous linear functionals.


Hint. Given any nonzero element $x_0 \in E$, use Theorem 6 to construct a
linear functional $f$ continuous with respect to the norm $\|\cdot\|_1$ such that

$$f(x_0) \ne 0.$$


Problem 8. Show that every real locally convex topological linear space 
E satisfying the first axiom of separation has sufficiently many continuous 
linear functionals. 


Hint. Given any nonzero element $x_0 \in E$, show that there is a convex
symmetric³ neighborhood $U$ of zero such that $x_0 \notin U$. Let $p_U$ be the
Minkowski functional of $U$. Then, as in the proof of Theorem 6, p. 136,
$p_U$ is a finite convex functional on $E$ such that $p_U(-x) = p_U(x)$ and

$$p_U(x) \le 1 \ \text{ if } x \in U, \qquad p_U(x_0) \ge 1.$$

Define a linear functional $f_0(\lambda x_0) = \lambda$ on the set $L$ of all elements of the
form $\lambda x_0$. Clearly $|f_0(x)| \le p_U(x)$ on $L$ and $f_0(x_0) = 1$. Now use the Hahn-Banach
theorem to extend $f_0$ onto the whole space $E$.


Comment. The importance of locally convex spaces is mainly due to this 
property (which continues to hold in the complex case). 


19. The Conjugate Space 


19.1. Definition of the conjugate space. The operations of addition of 
functionals and multiplication of functionals by numbers are defined in the 
obvious way: 


DEFINITION 1. Let $f$ and $g$ be two functionals defined on a topological
linear space $E$, and let $\alpha$ be any number. Then by the sum of $f$ and $g$,
denoted by $f + g$, is meant the functional whose value at every point $x \in E$
is the sum of the values of $f$ and $g$ at $x$, while by the product of $\alpha$ and $f$,
denoted by $\alpha f$, is meant the functional whose value at every point $x \in E$ is
the product of $\alpha$ and the value of $f$ at $x$. More concisely,

$$(f + g)(x) = f(x) + g(x),$$
$$(\alpha f)(x) = \alpha f(x)$$

for every $x \in E$.


Clearly, if $f$ and $g$ are linear functionals, then so are $f + g$ and $\alpha f$. Moreover,
if $f$ and $g$ are bounded (and hence continuous), so are $f + g$ and $\alpha f$.


³ Recall Problem 10, p. 171.
Let $E^*$ be the set of all continuous linear functionals on $E$. Then the
space $E^*$, called the conjugate space of $E$, is itself a linear space, when
equipped with the operations of addition of functionals and multiplication
of functionals by numbers. This can be seen at once by verifying the three
axioms in Definition 1, p. 118. Note that the zero element in $E^*$ is the
functional $f \equiv 0$, equal to zero for all $x \in E$.

The next step is to introduce a topology in E*, besides the linear operations 
just described. This can be done in various ways. First we consider the 
particularly simple case where the original space E is a normed linear space. 


19.2. The conjugate space of a normed linear space. Let $f$ be a continuous
linear functional on a normed linear space $E$. In Sec. 18.2 we introduced the
concept of the norm of $f$, equal to

$$\|f\| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|}$$

(recall Theorem 4, p. 177). This quantity clearly has all the properties of a
norm, as listed on p. 138. In fact,


1) $\|f\| \ge 0$, where $\|f\| = 0$ if and only if $f = 0$;
2) $\|\alpha f\| = |\alpha|\,\|f\|$;
3) $\|f + g\| \le \|f\| + \|g\|$, since obviously

$$\sup_{x \ne 0} \frac{|f(x) + g(x)|}{\|x\|} \le \sup_{x \ne 0} \frac{|f(x)|}{\|x\|} + \sup_{x \ne 0} \frac{|g(x)|}{\|x\|}.$$

Hence the space $E^*$ conjugate to $E$ can be made into a normed linear space
by simply equipping each functional $f \in E^*$ with its norm $\|f\|$. The corresponding
topology in $E^*$ is called the strong topology in $E^*$. In cases where
we want to emphasize that $E^*$ is equipped with the norm $\|\cdot\|$, we will write
$(E^*, \|\cdot\|)$ instead of $E^*$.


Example 1. Let $E$ be Euclidean $n$-space (real or complex), and let
$e_1, \ldots, e_n$ be any basis in $E$, so that every vector $x \in E$ has a unique representation
of the form

$$x = \sum_{k=1}^{n} x_k e_k.$$

If $f$ is a linear functional on $E$, then clearly

$$f(x) = \sum_{k=1}^{n} x_k f(e_k). \tag{1}$$

Thus a linear functional on $E$ is uniquely determined by its values on the
basis vectors $e_1, \ldots, e_n$, where these values can be assigned arbitrarily.
Consider the linear functionals $f_1, \ldots, f_n$ defined by

$$f_j(e_k) = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases}$$

It is clear that these functionals are linearly independent, and moreover that

$$f_j(x) = x_j.$$

Hence we can write (1) in the form

$$f(x) = \sum_{k=1}^{n} f(e_k)\, f_k(x).$$

Thus the functionals $f_1, \ldots, f_n$ form a basis in the space $E^*$, called the dual
of the basis $e_1, \ldots, e_n$ in the original space $E$. Therefore $E^*$ is itself an
$n$-dimensional linear space. Of course, different norms in $E$ “induce”
different norms in $E^*$ (see Problem 1).
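
The dual basis can be computed explicitly in the plane. The following sketch is an illustration only: the basis vectors `e1`, `e2` and the functional `g` are hypothetical choices, and exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction as F

e1 = (F(2), F(1))    # hypothetical basis of the plane
e2 = (F(1), F(3))

def coords(x):
    """Coordinates (x_1, x_2) of x in the basis e1, e2, by Cramer's rule."""
    det = e1[0] * e2[1] - e2[0] * e1[1]
    x1 = (x[0] * e2[1] - e2[0] * x[1]) / det
    x2 = (e1[0] * x[1] - x[0] * e1[1]) / det
    return x1, x2

def f1(x):
    """Dual functional f_1: the first coordinate of x in the basis."""
    return coords(x)[0]

def f2(x):
    """Dual functional f_2: the second coordinate of x in the basis."""
    return coords(x)[1]

def g(x):
    """An arbitrary linear functional on the plane (hypothetical)."""
    return 5 * x[0] - 7 * x[1]

def g_expanded(x):
    """Expansion of g in the dual basis: g(e1) f_1(x) + g(e2) f_2(x)."""
    return g(e1) * f1(x) + g(e2) * f2(x)
```

For any vector `x`, `g_expanded(x)` coincides with `g(x)`, which is the expansion of a functional in the dual basis derived in the example.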


Example 2. Let $c_0$ be the space of all sequences $x = (x_1, \ldots, x_k, \ldots)$
converging to zero, with norm

$$\|x\| = \sup_k |x_k|.$$

Then the space $(c_0^*, \|\cdot\|)$ conjugate to $c_0$ is isomorphic (see footnote 17,
p. 155) to the space $l_1$ of all absolutely summable sequences $f = (f_1, \ldots,
f_k, \ldots)$,⁴ with norm

$$\|f\| = \sum_{k=1}^{\infty} |f_k|.$$

To prove this, we first note that, given any element $f = (f_1, \ldots, f_k, \ldots) \in l_1$,
the formula

$$\tilde f(x) = \sum_{k=1}^{\infty} x_k f_k \tag{2}$$

defines a functional $\tilde f$ on the space $c_0$, where $\tilde f$ is clearly linear. Moreover,
it follows from (2) that

$$|\tilde f(x)| \le \|x\| \sum_{k=1}^{\infty} |f_k| = \|x\|\,\|f\|$$

and hence

$$\|\tilde f\| \le \|f\|. \tag{3}$$


⁴ A sequence $\{f_k\}$, or $f = (f_1, \ldots, f_k, \ldots)$ in “point notation,” is said to be absolutely
summable if

$$\sum_{k=1}^{\infty} |f_k| < \infty.$$
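Consider the vectors

$$e_n = (0, \ldots, 0, 1, 0, \ldots) \qquad (n = 1, 2, \ldots),$$

with 1 in the $n$th place, in $c_0$, and let

$$x^{(n)} = \sum_{k=1}^{n} \frac{f_k}{|f_k|}\, e_k$$

(if $f_k = 0$, we set $f_k/|f_k| = 0$). Then $x^{(n)} \in c_0$, and

$$\|x^{(n)}\| \le 1. \tag{4}$$

Moreover

$$\tilde f(x^{(n)}) = \sum_{k=1}^{n} \frac{f_k}{|f_k|}\, \tilde f(e_k) = \sum_{k=1}^{n} |f_k|,$$

so that

$$\lim_{n \to \infty} \tilde f(x^{(n)}) = \sum_{k=1}^{\infty} |f_k| = \|f\|. \tag{5}$$

It follows from (4) and (5) that

$$\|\tilde f\| \ge \|f\| \tag{6}$$

(why?). Comparing (3) and (6), we get

$$\|\tilde f\| = \|f\|.$$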

Thus the mapping carrying $f$ into $\tilde f$ is a “norm-preserving” mapping of
$l_1$ into $c_0^*$. We must still verify that this mapping is one-to-one and “onto”
(see p. 5), i.e., that every functional $\tilde f \in c_0^*$ has a unique representation of
the form (2), where $f = (f_1, \ldots, f_k, \ldots) \in l_1$. Let $x = (x_1, \ldots, x_k, \ldots) \in c_0$.
Then

$$x = \sum_{k=1}^{\infty} x_k e_k,$$

where the series on the right converges in $c_0$ to the element $x$, since

$$\left\| x - \sum_{k=1}^{n} x_k e_k \right\| = \sup_{k > n} |x_k| \to 0$$

as $n \to \infty$. Since the functional $\tilde f \in c_0^*$ is continuous,

$$\tilde f(x) = \sum_{k=1}^{\infty} x_k \tilde f(e_k)$$

(where is the continuity used?). Hence $\tilde f$ has a unique representation of the
form (2), and we need only verify that

$$\sum_{k=1}^{\infty} |\tilde f(e_k)| < \infty. \tag{7}$$

This time let

$$x^{(n)} = \sum_{k=1}^{n} \frac{\tilde f(e_k)}{|\tilde f(e_k)|}\, e_k.$$

Noting that $x^{(n)} \in c_0$ and $\|x^{(n)}\| \le 1$, we find that

$$\sum_{k=1}^{n} |\tilde f(e_k)| = \sum_{k=1}^{n} \frac{\tilde f(e_k)}{|\tilde f(e_k)|}\, \tilde f(e_k) = \tilde f(x^{(n)}) \le \|\tilde f\|.$$

But this implies (7), since $n$ can be made arbitrarily large.

Whether or not the original space $E$ is complete, we have


THEOREM 1. The conjugate space $(E^*, \|\cdot\|)$ is complete.


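Proof. Let $\{f_n\}$ be a fundamental sequence of functionals in $E^*$. Then,
given any $\varepsilon > 0$, there is an integer $N$ such that $n, n' > N$ implies

$$\|f_n - f_{n'}\| < \varepsilon,$$

so that

$$|f_n(x) - f_{n'}(x)| \le \|f_n - f_{n'}\|\,\|x\| < \varepsilon\,\|x\|$$

for every $x \in E$. Therefore the sequence $\{f_n(x)\}$ is fundamental and hence
convergent for every $x \in E$. Let

$$f(x) = \lim_{n \to \infty} f_n(x).$$

Then $f$ is linear, since

$$f(\alpha x + \beta y) = \lim_{n \to \infty} f_n(\alpha x + \beta y) = \lim_{n \to \infty} [\alpha f_n(x) + \beta f_n(y)] = \alpha f(x) + \beta f(y).$$

Moreover, choosing $n$ so large that $\|f_n - f_{n+p}\| < 1$ for all $p > 0$, we
have $\|f_{n+p}\| < \|f_n\| + 1$ for all $p > 0$, and hence

$$|f_{n+p}(x)| \le (\|f_n\| + 1)\,\|x\|.$$

It follows that

$$\lim_{p \to \infty} |f_{n+p}(x)| = |f(x)| \le (\|f_n\| + 1)\,\|x\|,$$

so that $f$ is bounded and hence continuous.

To complete the proof, we now show that the functional $f$ is the limit
of the sequence $\{f_n\}$, i.e., that

$$\lim_{n \to \infty} \|f_n - f\| = 0. \tag{8}$$

Given any $\varepsilon > 0$, let $n$ be so large that

$$\|f_n - f_{n+p}\| < \frac{\varepsilon}{3} \tag{9}$$

for all $p > 0$. By the definition of the norm in $E^*$, there is a nonzero
element $x_{n,\varepsilon} \in E$ such that

$$\|f_n - f\| \le \frac{|f_n(x_{n,\varepsilon}) - f(x_{n,\varepsilon})|}{\|x_{n,\varepsilon}\|} + \frac{\varepsilon}{3} = |f_n(y_{n,\varepsilon}) - f(y_{n,\varepsilon})| + \frac{\varepsilon}{3},$$

where

$$y_{n,\varepsilon} = \frac{x_{n,\varepsilon}}{\|x_{n,\varepsilon}\|}.$$

Therefore

$$\|f_n - f\| \le |f_n(y_{n,\varepsilon}) - f_{n+p}(y_{n,\varepsilon}) + f_{n+p}(y_{n,\varepsilon}) - f(y_{n,\varepsilon})| + \frac{\varepsilon}{3}$$
$$\le \|f_n - f_{n+p}\|\,\|y_{n,\varepsilon}\| + |f_{n+p}(y_{n,\varepsilon}) - f(y_{n,\varepsilon})| + \frac{\varepsilon}{3},$$

or

$$\|f_n - f\| \le |f_{n+p}(y_{n,\varepsilon}) - f(y_{n,\varepsilon})| + \frac{2\varepsilon}{3} \tag{10}$$

after using (9) and the fact that $\|y_{n,\varepsilon}\| = 1$. But

$$\lim_{p \to \infty} f_{n+p}(y_{n,\varepsilon}) = f(y_{n,\varepsilon}),$$

by the very definition of $f$. Hence, taking the limit as $p \to \infty$ in (10),
we get

$$\|f_n - f\| < \varepsilon,$$

which implies (8), since $\varepsilon > 0$ is arbitrary. ∎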
Next we examine the structure of the space conjugate to a Hilbert space: 


THEOREM 2. Let $H$ be a real Hilbert space. Then, given any $x_0 \in H$,
the formula

$$f(x) = (x, x_0) \qquad (x \in H) \tag{11}$$

defines a continuous linear functional on $H$, with $\|f\| = \|x_0\|$. Conversely,
given any continuous linear functional $f$ on $H$, there is a unique element
$x_0 \in H$ such that (11) holds, with $\|x_0\| = \|f\|$.


Proof. Given any $x_0 \in H$, formula (11) obviously defines a linear
functional on $H$. By Schwarz's inequality,

$$|f(x)| = |(x, x_0)| \le \|x\|\,\|x_0\|, \tag{12}$$

so that $f$ is bounded and hence continuous. Moreover $\|f\| = \|x_0\|$,
because of (12) and the fact that $f(x_0) = \|x_0\|^2$.

Conversely, let $f$ be any continuous linear functional on $H$. If $f \equiv 0$,
then $f$ obviously has the representation (11) with $x_0 = 0$ (in this case
$\|x_0\| = \|f\| = 0$). Otherwise, let

$$H_0 = \{x : f(x) = 0\}$$

be the null space of $f$. Since $f$ is continuous, $H_0$ is a closed subspace of $H$.
According to Theorem 3, Corollary 2, p. 126, the codimension of the null
space of any nontrivial linear functional $f$ equals 1. Therefore, by
Theorem 14, Corollary 2, p. 159, the orthogonal complement $H_0^{\perp}$ of the
space $H_0$ is one-dimensional, i.e., there exists a nonzero vector $y_0$
orthogonal to $H_0$ such that every vector $x \in H$ has a unique representation
of the form

$$x = y + \lambda y_0, \tag{13}$$

where $y \in H_0$. Clearly, there is no loss of generality in assuming that
$\|y_0\| = 1$. Now let

$$x_0 = f(y_0)\,y_0. \tag{14}$$

Then, given any $x \in H$, we have

$$f(x) = f(y + \lambda y_0) = \lambda f(y_0)$$

because of (13), and

$$(x, x_0) = \lambda (y_0, x_0) = \lambda\,(y_0, f(y_0)\,y_0) = \lambda f(y_0)$$

because of (14). Therefore (11) holds for all $x \in H$. To prove the
uniqueness of $x_0$, suppose

$$f(x) = (x, x_0') \qquad (x \in H). \tag{11′}$$

Then, subtracting (11′) from (11), we get

$$(x, x_0 - x_0') = 0 \qquad (x \in H),$$

which immediately implies $x_0' = x_0$ after choosing $x = x_0 - x_0'$. ∎
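
The construction (14) can be traced in Euclidean 3-space, regarded as a finite-dimensional real Hilbert space. This is a sketch under stated assumptions: the functional `f` is a hypothetical example, and the direction orthogonal to its null space is read off from its values on the standard basis, a shortcut available in finite dimensions that is not part of the proof above.

```python
import math

def dot(x, y):
    """Scalar product in Euclidean 3-space."""
    return sum(xi * yi for xi, yi in zip(x, y))

def f(x):
    """A hypothetical continuous linear functional on 3-space."""
    return 2.0 * x[0] - 1.0 * x[1] + 2.0 * x[2]

basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Values of f on the standard basis give a vector orthogonal to the null space.
g = [f(e) for e in basis]
length = math.sqrt(dot(g, g))

y0 = [gi / length for gi in g]      # unit vector orthogonal to the null space
x0 = [f(y0) * yi for yi in y0]      # the representing element, formula (14)

test_vectors = [[1.0, 2.0, 3.0], [-4.0, 0.5, 7.0], [0.0, 1.0, 0.5]]
values = [(f(x), dot(x, x0)) for x in test_vectors]
```

Each pair in `values` has equal entries, so $f(x) = (x, x_0)$ on the test vectors, and the norm of `x0` equals 3, the norm of this particular `f`.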


COROLLARY. The correspondence $x_0 \leftrightarrow f$ is an isomorphism between
$H$ and $H^*$, regarded as normed linear spaces.


Proof. If

$$f(x) = (x, x_0), \qquad g(x) = (x, y_0),$$

then

$$\alpha f(x) + \beta g(x) = (x, \alpha x_0 + \beta y_0).$$

Moreover $\|x_0\| = \|f\|$. ∎


19.3. The strong topology in the conjugate space. Let $E$ be a normed linear
space. Then as we have seen, the conjugate space $E^*$ is itself a normed
linear space, and a neighborhood of zero in $E^*$ means the set of all continuous
linear functionals on $E$ satisfying the condition $\|f\| < \varepsilon$ for some $\varepsilon > 0$. In
other words, for a neighborhood base at zero in the space $E^*$ we can take
the set of all functionals in $E^*$ such that $|f(x)| < \varepsilon$ when $x$ ranges over the
closed unit sphere $\|x\| \le 1$ in the space $E$. Suppose $E$ is a topological linear
space, but not a normed linear space. Then in defining the topology in $E^*$ it
seems natural to start from an arbitrary bounded set $A \subset E$, since there is no
longer a “unit sphere.” This suggests


DEFINITION 2. Let $E$ be a topological linear space, with conjugate
space $E^*$. Then by the strong topology⁵ in $E^*$ is meant the topology
generated by the neighborhood base at zero consisting of all sets of the form

$$U_{A,\varepsilon} = \{f : |f(x)| < \varepsilon \text{ for all } x \in A\} \tag{15}$$

for some number $\varepsilon > 0$ and bounded set $A \subset E$.⁶


Regardless of the topology in the original space $E$, we have


THEOREM 3. The conjugate space $E^*$, equipped with the strong
topology, is a locally convex $T_1$-space.


Proof. If $f_0 \in E^*$ and $f_0 \ne 0$, then there is an element $x_0 \in E$ such
that $f_0(x_0) \ne 0$. Let

$$\varepsilon = \tfrac{1}{2}\,|f_0(x_0)|, \qquad A = \{x_0\}.$$

Then clearly $f_0 \notin U_{A,\varepsilon}$ and hence $E^*$ is a $T_1$-space. To verify that the
strong topology in $E^*$ is locally convex, we need only note that $U_{A,\varepsilon}$ is
a convex set in $E^*$ for any $\varepsilon > 0$ and any bounded set $A \subset E$. ∎


Remark. The strong topology in $E^*$ will be denoted by the symbol $b$.
In cases where we want to emphasize that $E^*$ is equipped with the strong
topology, we will write $(E^*, b)$ instead of $E^*$.


19.4. The second conjugate space. Since the set of all continuous linear
functionals on a topological linear space $E$ is itself a topological linear space,
namely the conjugate space $(E^*, b)$, we can also talk about the second
conjugate space $E^{**} = (E^*)^*$, i.e., the set of all continuous linear functionals
on $E^*$, the third conjugate space $E^{***} = (E^{**})^*$, and so on.


THEOREM 4. Given a topological linear space $E$ with conjugate space
$E^*$, let $x_0$ be any fixed element of $E$. Then

$$\psi_{x_0}(f) = f(x_0)$$


⁵ As opposed to the weak topology in $E^*$, to be discussed in Sec. 20.3.
⁶ See Problem 8.
is a continuous linear functional on $E^*$.

Proof. The linearity is obvious, since

$$\psi_{x_0}(\alpha f + \beta g) = \alpha f(x_0) + \beta g(x_0) = \alpha \psi_{x_0}(f) + \beta \psi_{x_0}(g) \qquad (f, g \in E^*).$$

As for the continuity, given any $\varepsilon > 0$, let $A$ be a bounded subset of $E$
containing $x_0$, and let $U_{A,\varepsilon}$ be the neighborhood (15). Then

$$|\psi_{x_0}(f)| = |f(x_0)| < \varepsilon \quad \text{if } f \in U_{A,\varepsilon},$$

i.e., the functional $\psi_{x_0}$ is continuous at 0 and hence continuous on the
whole space $E^*$. ∎


Thus the mapping

$$\pi(x_0) = \psi_{x_0},$$

called the natural mapping of $E$ into $E^{**}$, is a mapping of the whole space
$E$ onto some subset $\pi(E)$ of the second conjugate space $E^{**}$. Clearly $\pi$ is
linear, in the sense that

$$\pi(\alpha x + \beta y) = \psi_{\alpha x + \beta y} = \alpha \psi_x + \beta \psi_y = \alpha \pi(x) + \beta \pi(y).$$

Suppose $E$ has sufficiently many continuous linear functionals, e.g., suppose
$E$ is a normed linear space or a locally convex topological linear space
satisfying the first axiom of separation.⁷ Then $\pi$ is one-to-one, since, given
any two distinct elements $x_1, x_2 \in E$, there is a functional $f \in E^*$ such that
$f(x_1) \ne f(x_2)$ and hence $\pi(x_1) \ne \pi(x_2)$. Being the conjugate space of $(E^*, b)$,
$E^{**}$ can also be equipped with a strong topology (introduced by the obvious
analogue of Definition 2), which we denote by $b^*$.

If $\pi(E) = E^{**}$, the space $E$ is said to be semireflexive. It can be shown
(see Problem 9) that the inverse mapping $\pi^{-1}$ carrying $\pi(E)$ into $E$ is always
continuous. If $E$ is semireflexive and if $\pi$ (as well as $\pi^{-1}$) is continuous,
the space $E$ is said to be reflexive and $\pi$ then establishes a homeomorphism
between the space $E$ and $(E^{**}, b^*)$. In this case, each element $x \in E$ can be
identified with the corresponding element $\pi(x) \in E^{**}$, and hence it is convenient
to denote the value of a functional $f \in E^*$ at the point $x \in E$ by the
more symmetric notation

$$f(x) = (f, x).$$

Thus $(f, x)$ can be regarded as a functional on $E$ for each fixed $f \in E^*$, and as
a functional on $E^*$ for each fixed $x \in E$ (in the latter case, $x$ also acts like
an element of $E^{**}$).


THEOREM 5. If $E$ is a normed linear space (so that in particular $E^*$
and $E^{**}$ are also normed linear spaces), then the natural mapping of $E$
into $E^{**}$ is an isometry.


⁷ Recall Problem 8, p. 183.
Proof. Given an element $x \in E$, let $\|x\|$ denote the norm of x in E and
$\|x\|_2$ the norm of its image in E**. We want to show that $\|x\| = \|x\|_2$.
To this end, let f be any element of E*. Then

\[ |(f, x)| \le \|f\| \, \|x\|, \]

i.e.,

\[ \|x\| \ge \frac{|(f, x)|}{\|f\|} \qquad (f \neq 0), \]

and since the left-hand side is independent of f,

\[ \|x\| \ge \sup_{f \neq 0} \frac{|(f, x)|}{\|f\|} = \|x\|_2. \tag{16} \]

On the other hand, by the Hahn-Banach theorem, for every $x_0 \in E$ there
is a linear functional $f_0$ such that

\[ (f_0, x_0) = \|f_0\| \, \|x_0\|. \tag{17} \]

In fact, to construct such a functional, we need only set $f_0(x) = \lambda$ for any
element of the form $x = \lambda x_0$, and then extend $f_0$ to a functional on the whole
space E (without changing its norm). It follows from (17) that

\[ \|x\|_2 = \sup_{f \neq 0} \frac{|(f, x)|}{\|f\|} \ge \|x\|. \tag{18} \]

Comparing (16) and (18), we get $\|x\| = \|x\|_2$. ∎
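
The two inequalities in this proof can be checked numerically in a finite-dimensional case. The sketch below works in R^4 with the p-norm of Problem 1b and takes for granted (an assumption here, since it is the content of that problem) that the conjugate norm is the q-norm with 1/p + 1/q = 1. The functional with coordinates sign(x_k)|x_k|^(p-1) is one explicit choice attaining the supremum; the names p_norm, pairing and norming_functional are illustrative only.

```python
import random


def p_norm(v, p):
    """The p-norm of a finite vector."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def pairing(f, x):
    """Value (f, x) of the functional with coordinates f at the point x."""
    return sum(fk * xk for fk, xk in zip(f, x))


def norming_functional(x, p):
    """Coordinates of a functional f0 with (f0, x) = ||f0||_q * ||x||_p."""
    return [(1.0 if t >= 0 else -1.0) * abs(t) ** (p - 1.0) for t in x]


p = 3.0
q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
x = [1.0, -2.0, 0.5, 3.0]

# The ratio |(f0, x)| / ||f0||_q reproduces ||x||_p, as in (17) and (18).
f0 = norming_functional(x, p)
attained = pairing(f0, x) / p_norm(f0, q)

# Randomly chosen functionals never exceed ||x||_p, as in (16).
random.seed(0)
best_random = max(
    abs(pairing(f, x)) / p_norm(f, q)
    for f in ([random.uniform(-1.0, 1.0) for _ in x] for _ in range(2000))
)
```

Here `attained` agrees with `p_norm(x, p)` up to rounding, while `best_random` stays below it. The random search only samples the supremum in (16); it does not prove the inequality.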


COROLLARY. The concepts of semireflexivity and reflexivity coincide 
for a normed linear space. 


Proof. If the natural mapping $\pi$ is an isometry, then obviously both
$\pi$ and $\pi^{-1}$ are continuous. ∎


Remark. According to Theorem 5, every normed linear space E is isometric
to the linear manifold $\pi(E) \subset E^{**}$.⁸ Identifying E with $\pi(E)$, we
can assert that $E \subset E^{**}$ in general, and $E = E^{**}$ if E is reflexive (or
semireflexive).


THEOREM 6. Every reflexive normed linear space is complete. 


Proof. If E is reflexive, then $E = E^{**}$. But $E^{**} = (E^*)^*$ is complete,
by Theorem 1, p. 187. ∎





8 The set $\pi(E)$ need not be closed.
Example 1. Finite-dimensional Euclidean spaces and Hilbert space are
the simplest examples of reflexive spaces (in fact, for such spaces E = E*).
This follows from Theorem 2 (cf. Problem 5).


Example 2. The space $c_0$ of all sequences $x = (x_1, \ldots, x_n, \ldots)$ converging
to zero is an example of a complete nonreflexive space. In fact, as we saw
in Example 2, p. 185, the conjugate space of $c_0$ is the space $l_1$ of all absolutely
summable sequences, which in turn has the space m of all bounded sequences
(not necessarily converging to zero) as its conjugate space (see Problem 2c).


Example 3. It can be shown that the space $C_{[a,b]}$ of all continuous
functions on [a, b] is nonreflexive, and even that there is no normed linear
space with $C_{[a,b]}$ as its conjugate space.


Example 4. The space $l_p$, where $1 < p \neq 2$, is an example of a reflexive
space which does not coincide with its conjugate space. In fact, $l_p^* = l_q$,
where

\[ \frac{1}{p} + \frac{1}{q} = 1, \]

and hence $l_p^{**} = l_q^* = l_p$.


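Problem 1. Let E be Euclidean n-space (real or complex), and let
$e_1, \ldots, e_n$ be a basis in E. Let $x_1, \ldots, x_n$ be the coordinates of a vector
$x \in E$ with respect to the basis $e_1, \ldots, e_n$, and let $f^1, \ldots, f^n$ be the
coordinates of a functional $f \in E^*$ with respect to the dual basis $f_1, \ldots, f_n$. Prove
that in each of the following pairs, the norm in E* is the norm “induced”
by the corresponding norm in E:

a) $\|x\| = \left( \sum_{k=1}^n |x_k|^2 \right)^{1/2}$, $\|f\| = \left( \sum_{k=1}^n |f^k|^2 \right)^{1/2}$;

b) $\|x\| = \left( \sum_{k=1}^n |x_k|^p \right)^{1/p}$, $\|f\| = \left( \sum_{k=1}^n |f^k|^q \right)^{1/q}$,
where $\frac{1}{p} + \frac{1}{q} = 1$ $(p, q > 1)$;

c) $\|x\| = \sup_{1 \le k \le n} |x_k|$, $\|f\| = \sum_{k=1}^n |f^k|$;

d) $\|x\| = \sum_{k=1}^n |x_k|$, $\|f\| = \sup_{1 \le k \le n} |f^k|$.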


Problem 2. Let $l_p$ be the normed linear space of all sequences
$x = (x_1, \ldots, x_k, \ldots)$ with norm

\[ \|x\| = \left( \sum_{k=1}^\infty |x_k|^p \right)^{1/p} < \infty \qquad (p \ge 1). \]

Prove that
a) If p > 1, the space $l_p^*$ conjugate to $l_p$ is isomorphic to the space $l_q$,
where

\[ \frac{1}{p} + \frac{1}{q} = 1; \]

b) If p > 1, the general form of a continuous linear functional on $l_p$ is

\[ f(x) = \sum_{k=1}^\infty x_k f_k, \]

where $x = (x_1, \ldots, x_k, \ldots) \in l_p$, $f = (f_1, \ldots, f_k, \ldots) \in l_q$;
c) If p = 1, $l_1^*$ is isomorphic to the space m of all bounded sequences
$x = (x_1, \ldots, x_k, \ldots)$ with norm $\|x\| = \sup_k |x_k|$.


Problem 3. Let E be an incomplete normed linear space, with completion
$\bar{E}$. Prove that the conjugate spaces E* and $(\bar{E})^*$ are isomorphic.


Hint. Given any $f \in E^*$, extend f by continuity to a functional $\bar{f} \in (\bar{E})^*$.
Conversely, given any $\bar{f} \in (\bar{E})^*$, let f be the restriction of $\bar{f}$ to E, namely
the functional $f(x) = \bar{f}(x)$ for all $x \in E$. Show that $f \leftrightarrow \bar{f}$ is the desired
isomorphism (with $\|f\| = \|\bar{f}\|$).


Problem 4. Let E be an incomplete Euclidean space with the Hilbert
space H as its completion. Prove that E* and H are isomorphic. 


Problem 5. Particularize Theorem 2 to the case of a finite-dimensional 
Euclidean space. 


Problem 6. Generalize Theorem 2 to the case of a complex Hilbert space. 


Hint. Write $x_0 = \overline{f(y_0)}\, y_0$ instead of (14). The isomorphism of H and H*
associating the functional $f(x) = (x, x_0)$ with $x_0$ is then “conjugate-linear”
in the sense that $\alpha f$ is associated with $\bar{\alpha} x_0$.


Problem 7. Let $\Phi$ be the same countably normed space of “rapidly
decreasing sequences” as in Problem 12c, p. 172. Find the conjugate space $\Phi^*$.


Hint. Use Problem 6, p. 182.


Ans. $\Phi^*$ is the space of all functionals f of the form

\[ f(x) = \sum_{k=1}^\infty f_k x_k, \]

where $f = (f_1, \ldots, f_k, \ldots)$ is any sequence satisfying the condition

\[ \sum_{k=1}^\infty k^{-n} f_k^2 < \infty \]

for some nonnegative integer n.
Problem 8. Let E, E*, and $U_{A,\varepsilon}$ be the same as in Definition 2. Verify
that the system of sets $U_{A,\varepsilon}$ actually generates a topology b in E* such that the
linear operations in E* are continuous with respect to b. Prove that if E
is a normed linear space, then b coincides with the “norm topology” of
Sec. 19.2.


Problem 9. Let E be a topological linear space, and let b* be the strong
topology in E** and $\pi$ the natural mapping of E into E**. Prove that
$\pi^{-1}$ is continuous.


Hint. The topology b* induces a topology $\pi^{-1}(b^*)$ in the space E, in
which a set $G \subset E$ is said to be open if its image $\pi(G)$ is the intersection of
$\pi(E)$ with an open subset of (E**, b*). Show that $\pi^{-1}(b^*)$ is stronger than
the original topology in E.


Problem 10. Prove that every closed subspace of a reflexive space is itself 
reflexive. 


20. The Weak Topology and Weak Convergence 


20.1. The weak topology in a topological linear space. Let E be a topological
linear space, with conjugate space E*. Given any $\varepsilon > 0$ and any
finite set of continuous linear functionals $f_1, \ldots, f_r \in E^*$, the set

\[ U = U_{f_1, \ldots, f_r;\, \varepsilon} = \{ x : |f_1(x)| < \varepsilon, \ldots, |f_r(x)| < \varepsilon \} \tag{1} \]

is open in E and contains the point zero, i.e., U is a neighborhood of zero.
Let $\mathcal{B}$ be the system of all sets of the form (1). Then $\mathcal{B}$ is a neighborhood
base at zero, generating a topology in E which is again the topology of a
topological linear space (the details are left as an exercise). This topology is
called the weak topology in E. Every subset of E which is open in the weak
topology is also open in the original topology of E, but the converse may
not be true, i.e., $\mathcal{B}$ may not be a neighborhood base at zero for the original
topology in E. In other words, the weak topology is weaker (as defined on
p. 80) than the original topology, as anticipated by the terminology.
Clearly, the weak topology in E is the weakest topology $\tau$ with the property
that every linear functional continuous with respect to the original topology
is also continuous with respect to $\tau$.


20.2. Weak convergence. The weak topology in E may not satisfy the 
first axiom of countability, even in the case where E is a normed linear space.
Hence the weak topology cannot in general be described in the language of 
convergent sequences. Nevertheless, the weak topology determines an 
important kind of convergence in E, called weak convergence. By contrast, 
the convergence in E determined by the original topology (by the norm, if 
E is a normed linear space) is called strong convergence. 
THEOREM 1. A sequence $\{x_n\}$ of elements in a topological linear space
E is weakly convergent to an element $x_0 \in E$ if and only if the numerical
sequence $\{f(x_n)\}$ converges to $f(x_0)$ for every $f \in E^*$, i.e., for every
continuous linear functional f on E.


Proof. Clearly, there is no loss of generality in assuming that $x_0 = 0$.
Suppose $f(x_n) \to 0$ for every $f \in E^*$. Then, given any “weak neighborhood”
(1), let $N_i$ be such that $|f_i(x_n)| < \varepsilon$ for all $n > N_i$ $(i = 1, \ldots, r)$,
and let $N = \max \{N_1, \ldots, N_r\}$. Then $x_n \in U$ for all $n > N$, i.e., $\{x_n\}$
converges to $x_0$ in the weak topology.

Conversely, suppose that for each neighborhood (1), there is an integer
$N = N(U)$ such that $x_n \in U$ for all $n > N$. Then obviously $f(x_n) \to 0$
for any given $f \in E^*$, as we see by choosing f to be one of the functionals
$f_1, \ldots, f_r$ figuring in the definition of U. ∎


Specializing to the case where E is a normed linear space, we have 


THEOREM 2. Let $\{x_n\}$ be a weakly convergent sequence of elements in
a normed linear space E. Then $\{x_n\}$ is bounded, i.e., there is a constant C
such that

\[ \|x_n\| \le C \qquad (n = 1, 2, \ldots). \]

Proof. Suppose $\{x_n\}$ is unbounded. Then $\{x_n\}$ is unbounded on every
closed sphere

\[ S[f_0, \varepsilon] = \{ f : \|f - f_0\| \le \varepsilon \} \]

in E*, in the sense that the set of numbers

\[ \{ (f, x_n) : f \in S[f_0, \varepsilon],\ n = 1, 2, \ldots \} \]

is unbounded for every $S[f_0, \varepsilon] \subset E^*$. In fact, if the sequence $\{x_n\}$ is
bounded on $S[f_0, \varepsilon]$, then it is also bounded on the sphere

\[ S[0, \varepsilon] = \{ g : \|g\| \le \varepsilon \}, \]

since if $g \in S[0, \varepsilon]$, then

\[ f_0 + g \in S[f_0, \varepsilon], \qquad (g, x_n) = (f_0 + g, x_n) - (f_0, x_n), \]

where the numbers $(f_0, x_n)$ are bounded, by the weak convergence of
$\{x_n\}$. But if $|(g, x_n)| \le C$ for all $g \in S[0, \varepsilon]$, then, by the isometry of the
natural mapping of E into E**,

\[ \|x_n\| = \sup_{\|g\| \le 1} |(g, x_n)| = \frac{1}{\varepsilon} \sup_{\|g\| \le \varepsilon} |(g, x_n)| \le \frac{C}{\varepsilon}, \]

so that $\{x_n\}$ is bounded, contrary to assumption. It follows that if $\{x_n\}$
is unbounded, then $\{x_n\}$ is unbounded on every closed sphere in E*.
Next, choosing any closed sphere $S_0 \subset E^*$, we find an integer $n_1$ and
an element $f \in S_0$ such that

\[ |(f, x_{n_1})| > 1. \tag{2} \]

Since $(f, x)$ depends continuously on f, the inequality (2) holds for all f
belonging to some closed sphere $S_1 \subset S_0$. Repeating this argument, we
find an integer $n_2$ and a closed sphere $S_2 \subset S_1$ such that

\[ |(f, x_{n_2})| > 2 \]

for all $f \in S_2$, and so on, where in general there is an integer $n_k$ and a
closed sphere $S_k \subset S_{k-1}$ such that

\[ |(f, x_{n_k})| > k \]

for all $f \in S_k$. At the same time, we can obviously see to it that the
radius of the sphere $S_k$ approaches zero as $k \to \infty$. Since E* is complete,
by Theorem 1, p. 187, it follows from the nested sphere theorem
(Theorem 2, p. 60) that there is an element f contained in all the
spheres $S_k$. But then

\[ |(f, x_{n_k})| > k \qquad (k = 1, 2, \ldots), \]

contrary to the assumed weak convergence of the sequence $\{x_n\}$. ∎


COROLLARY 1. Let $\{x_n\}$ be a sequence of elements in a normed linear
space E such that the numerical sequence $\{(f, x_n)\}$ is bounded for every
$f \in E^*$. Then $\{x_n\}$ is bounded.


Proof. In proving Theorem 2, the weak convergence of $\{x_n\}$ was
invoked only to infer the boundedness of the sequence $\{(f_0, x_n)\}$. ∎


Generalizing Corollary 1, we get 


COROLLARY 2. Let M be a weakly bounded subset of a normed linear
space E, i.e., a subset bounded in the weak topology. Then M is strongly
bounded, i.e., M is contained in some closed sphere.


Proof. Suppose M contains a sequence $\{x_n\}$ such that $\|x_n\| \to \infty$, and
let M′ be the set of all points $x_n$ (n = 1, 2, ...). Since M is weakly
bounded, so is M′. This means that M′ is “absorbed” by any weak
neighborhood of zero, in particular by any neighborhood

\[ U = \{ x : |(f, x)| < 1 \} \qquad (f \in E^*), \]

in the sense that there is a number $\alpha > 0$ such that $M' \subset \alpha U$. But then
$|(f, x_n)| < \alpha$ for all n, which, by Corollary 1, contradicts the assumption
that $\|x_n\| \to \infty$. ∎
COROLLARY 3. A necessary and sufficient condition for a subset M of
a normed linear space E to be (strongly) bounded is that every continuous
linear functional $f \in E^*$ be bounded on M.


Proof. The necessity follows at once from the inequality

\[ |f(x)| \le \|f\| \, \|x\|, \]

while the sufficiency is an immediate consequence of Corollary 2 and the
meaning of weak boundedness. ∎


A useful test for weak convergence of a sequence is given by 


THEOREM 3. A bounded sequence $\{x_n\}$ of elements in a normed linear
space E is weakly convergent to an element $x \in E$ if $f(x_n) \to f(x)$ for
every $f \in A$, where A is any set whose linear hull is everywhere dense in E*.


Proof. Let $\varphi$ be an arbitrary element of E*, and let $\{\varphi_k\}$ be a sequence
of linear combinations of elements of A converging to $\varphi$ (such a sequence
exists, since the linear hull of A is everywhere dense in E*). Let C be such that

\[ \|x\| \le C, \qquad \|x_n\| \le C \qquad (n = 1, 2, \ldots). \]

Moreover, given any $\varepsilon > 0$, choose k so large that $\|\varphi - \varphi_k\| < \varepsilon$ (this
is possible, since $\varphi_k \to \varphi$). Then

\[
\begin{aligned}
|\varphi(x_n) - \varphi(x)| &\le |\varphi(x_n) - \varphi_k(x_n)| + |\varphi_k(x_n) - \varphi_k(x)| + |\varphi_k(x) - \varphi(x)| \\
&\le C\varepsilon + C\varepsilon + |\varphi_k(x_n) - \varphi_k(x)|.
\end{aligned}
\tag{3}
\]

But $\varphi_k(x_n) \to \varphi_k(x)$ as $n \to \infty$, since $\varphi_k$ is a linear combination of
elements of A, and $f(x_n) \to f(x)$ for every $f \in A$, by hypothesis. Therefore
we can make the right-hand side of (3) as small as we please, by
choosing $\varepsilon$ sufficiently small and n sufficiently large. It follows that
$\varphi(x_n) \to \varphi(x)$ for every $\varphi \in E^*$, i.e., $\{x_n\}$ converges weakly to x. ∎


The meaning of weak convergence in various spaces is illustrated by the 
following examples: 


Example 1. Given a finite-dimensional Euclidean space $R^n$, let $e_1, \ldots, e_n$
be any orthonormal basis in $R^n$, and let $\{x^{(k)}\}$ be a sequence in $R^n$ converging
weakly to a vector $x = (x_1, \ldots, x_n) \in R^n$. Then

\[ x_j^{(k)} = (x^{(k)}, e_j) \to (x, e_j) = x_j \qquad (j = 1, \ldots, n), \]

i.e., for every j the sequence $\{x_j^{(k)}\}$ of components of the vectors $x^{(k)}$ converges
to the corresponding component of the limit vector x. But then

\[ \rho(x^{(k)}, x) = \sqrt{\sum_{j=1}^n \left( x_j^{(k)} - x_j \right)^2} \to 0 \]

as $k \to \infty$, so that $\{x^{(k)}\}$ converges strongly to x. On the other hand, strong
convergence obviously implies weak convergence in any space. Thus we see
that weak convergence and strong convergence are equivalent concepts in $R^n$.


Example 2. Let $\{x^{(k)}\}$ be a (strongly) bounded sequence of elements of $l_2$.
Then $\{x^{(k)}\}$ converges weakly to an element $x \in l_2$ if

\[ x_j^{(k)} = (x^{(k)}, e_j) \to (x, e_j) = x_j \qquad (j = 1, 2, \ldots), \]

where

\[ e_1 = (1, 0, 0, \ldots), \quad e_2 = (0, 1, 0, \ldots), \quad \ldots \]

is an orthonormal basis in $l_2$. This follows from Theorem 3, since linear
combinations of the elements $e_1, e_2, \ldots$ are everywhere dense in $l_2$, which
coincides with its own conjugate space (recall Problem 2a, p. 194). Thus
weak convergence in $l_2$ has the same interpretation in terms of components
as in $R^n$, i.e., for every j the sequence $\{x_j^{(k)}\}$ of components of the vectors
$x^{(k)}$ converges to the corresponding component of the limit vector x. However,
the concepts of weak convergence and strong convergence no longer
coincide in $l_2$. In fact, although obviously not strongly convergent, the
sequence of basis vectors $\{e_k\}$ converges weakly to zero. To see this, we note
that by Theorem 2, p. 188, every continuous linear functional f on $l_2$ can be
written as a scalar product

\[ f(x) = (x, a) \]

of a variable vector $x \in l_2$ with a fixed vector $a = (a_1, \ldots, a_k, \ldots) \in l_2$, so
that in particular

\[ f(e_k) = a_k. \]

But $a_k \to 0$ as $k \to \infty$ for every $a \in l_2$, and hence $f(e_k) \to 0 = f(0)$.
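
The behaviour of the basis vectors in Example 2 can be seen in a small numerical sketch. It truncates $l_2$ to finitely many coordinates (an assumption forced by the computer, not part of the example) and uses the particular vector a with $a_k = 1/k$, which is square-summable. The helper names are illustrative only.

```python
import math

N = 2000  # truncation length (an assumption; l_2 itself is infinite-dimensional)


def basis_vector(k, n=N):
    """The k-th coordinate vector e_k (1-based), truncated to n components."""
    return [1.0 if j == k - 1 else 0.0 for j in range(n)]


def inner(x, y):
    """Scalar product of two real vectors of equal length."""
    return sum(s * t for s, t in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x))


# A fixed vector a with a_k = 1/k; it defines the functional f(x) = (x, a).
a = [1.0 / (j + 1) for j in range(N)]

# f(e_k) = a_k, which tends to zero as k grows (weak convergence to zero).
values = [inner(basis_vector(k), a) for k in (1, 10, 100, 1000)]

# The distance between two distinct basis vectors does not shrink,
# so the sequence {e_k} cannot converge strongly.
distance = norm([s - t for s, t in zip(basis_vector(1), basis_vector(2))])
```

The entries of `values` decrease like 1/k, while `distance` stays at the square root of 2 for every pair of distinct basis vectors.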


Example 3. Consider the space $C_{[a,b]}$ of all functions continuous on
[a, b], and let $\{x_n(t)\}$ be a sequence of functions in $C_{[a,b]}$ converging weakly
to a function $x(t) \in C_{[a,b]}$. Among the continuous linear functionals on $C_{[a,b]}$,
we have the functionals $\delta_{t_0}$, $a \le t_0 \le b$ (see Example 5, p. 179), where $\delta_{t_0}$
assigns to each function $x(t) \in C_{[a,b]}$ its value at the fixed point $t_0$. Clearly,

\[ \delta_{t_0}(x_n) \to \delta_{t_0}(x) \]

means that

\[ x_n(t_0) \to x(t_0). \]

Hence, if the sequence $\{x_n(t)\}$ is weakly convergent, then
1) $\{x_n(t)\}$ is uniformly bounded on [a, b], i.e., there is a constant C such
that $|x_n(t)| \le C$ for all n = 1, 2, ... and all $t \in [a, b]$;⁹
2) $\{x_n(t)\}$ is pointwise convergent on [a, b], i.e., $\{x_n(t)\}$ is a convergent
numerical sequence for every fixed $t \in [a, b]$.


9 This follows from Theorem 2.
20.3. The weak topology and weak convergence in a conjugate space. Let
E be a topological linear space, with conjugate space E*. Suppose that
in Definition 2, p. 190, we require A to be finite instead of bounded. Then
the resulting topology, generated by the neighborhood base at zero consisting
of all sets of the form

\[ U_{A,\varepsilon} = \{ f : |f(x)| < \varepsilon \text{ for all } x \in A \} \tag{4} \]

for some number $\varepsilon > 0$ and finite set $A \subset E$, is called the weak topology in
E* instead of the strong topology. Clearly, the set (4) can also be written as

\[ U_{x_1, \ldots, x_n;\, \varepsilon} = \{ f : |f(x_1)| < \varepsilon, \ldots, |f(x_n)| < \varepsilon \} \tag{$4'$} \]

for some $\varepsilon > 0$ and points $x_1, \ldots, x_n \in E$. Since every finite set $A \subset E$ is
bounded, while in general there are bounded infinite sets in E, the weak
topology in E* is in fact weaker than the strong topology in E* (and in
general does not coincide with the strong topology).

The weak topology in E* determines a kind of convergence in E*, called 
weak convergence (of functionals). Weak convergence of functionals plays 
an important role in many problems of functional analysis, in particular in 
the theory of generalized functions (to be discussed in the next section). 
Obviously, a sequence $\{f_n\}$ of functionals $f_n \in E^*$ is weakly convergent to a
functional $f \in E^*$ if and only if $\{f_n(x)\}$ converges to $f(x)$ for every $x \in E$.

For weakly convergent sequences of functionals, we have the following 
analogues of Theorems 2 and 3: 


THEOREM 2′. Let $\{f_n\}$ be a weakly convergent sequence of continuous
linear functionals on a Banach space E. Then $\{f_n\}$ is bounded, i.e., there is
a constant C such that

\[ \|f_n\| \le C \qquad (n = 1, 2, \ldots). \]

Proof. The proof is the exact analogue of that of Theorem 2. Note
that this time we must specify that E is a complete normed linear space
(i.e., a Banach space). ∎


THEOREM 3′. A bounded sequence $\{f_n\}$ of continuous linear functionals
on a Banach space E is weakly convergent to a functional $f \in E^*$ if $f_n(x) \to f(x)$
for every $x \in A$, where A is any set whose linear hull is everywhere
dense in E.


Proof. The exact analogue of the proof of Theorem 3. 
Example. Let E be the space $C_{[a,b]}$ of all functions continuous on [a, b],
and consider the functional

\[ \delta_{t_0}(x) = x(t_0), \tag{5} \]

as in Example 3 above. For simplicity (and without loss of generality), we
assume that $t_0 = 0 \in (a, b)$, so that (5) becomes

\[ \delta_0(x) = x(0). \tag{6} \]

Let $\{f_n(t)\}$ be a sequence of functions continuous on [a, b] such that¹⁰

1) $f_n(t)$ is positive if $|t| < \frac{1}{n}$ and zero if $|t| \ge \frac{1}{n}$;

2) $\int_a^b f_n(t)\, dt = 1$ for all n = 1, 2, ...,

and let

\[ \delta_0^{(n)}(x) = \int_a^b f_n(t) x(t)\, dt. \]

Then $\delta_0^{(n)}$ is a continuous linear functional on $C_{[a,b]}$ (recall Example 4,
p. 179). Moreover, given any function $x(t) \in C_{[a,b]}$, we have

\[ \delta_0^{(n)}(x) = \int_a^b f_n(t) x(t)\, dt = \int_{-1/n}^{1/n} f_n(t) x(t)\, dt = x(\tau_n) \int_{-1/n}^{1/n} f_n(t)\, dt = x(\tau_n) \]

for some $\tau_n \in [-1/n, 1/n]$, by the mean value theorem for integrals, and hence

\[ \delta_0^{(n)}(x) \to x(0) = \delta_0(x) \tag{7} \]

as $n \to \infty$. Thus the sequence of functionals $\{\delta_0^{(n)}\}$ converges weakly to the
functional $\delta_0$. Suppose we write (6) in the form

\[ \delta_0(x) = \int_a^b \delta(t) x(t)\, dt, \]

in terms of the “delta function” $\delta(t)$, as in Example 3, p. 124. Then, loosely
speaking, (7) says that “the generalized function $\delta(t)$ is the weak limit of the
sequence of ordinary functions $f_n(t)$.”
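
The convergence in this example can be observed numerically. The sketch below takes [a, b] = [-1, 1] and uses triangular "hat" functions as one possible choice of the sequence described above; this choice, the midpoint integration rule, and the test function x(t) = cos t are all assumptions made for illustration.

```python
import math


def hat(n):
    """Triangular f_n: positive for |t| < 1/n, zero otherwise, integral 1."""
    def f(t):
        return max(0.0, n * (1.0 - n * abs(t)))
    return f


def integrate(g, a, b, steps=100_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def delta_n(n, x, a=-1.0, b=1.0):
    """The functional x -> integral over [a, b] of f_n(t) x(t) dt."""
    f = hat(n)
    return integrate(lambda t: f(t) * x(t), a, b)


x = math.cos  # a continuous function on [-1, 1] with x(0) = 1
approximations = [delta_n(n, x) for n in (2, 5, 20, 100)]
```

The entries of `approximations` increase towards x(0) = 1 as n grows. For this particular pair the integral can also be evaluated by hand as 2n²(1 - cos(1/n)), which gives an independent check on the quadrature.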


20.4. The weak* topology. There are two ways of regarding the space E*
of continuous linear functionals on a given space E, either as the space
conjugate to the original space E, or else as an “original space” in its own
right, with conjugate space E**. Correspondingly, there are two ways of
introducing a weak topology into E*, either by using neighborhoods of the
form (4′), or else by using the values of functionals in E** on the space E*,
as in Sec. 20.1. Clearly, the two topologies will be the same if and only if
E is reflexive (why?). Suppose E is nonreflexive. Then, to avoid confusion,
the weak topology determined in E* with the aid of E** will be called simply
the weak topology, while the topology determined in E* with the aid of E





10 As an exercise, give an explicit example of such a sequence $\{f_n(t)\}$.
will be called the weak* topology.¹¹ Clearly, the weak* topology in E* is
weaker than the weak topology in E*, i.e., the weak* topology has fewer 
open sets than the weak topology. Note that weak convergence as defined 
in Sec. 20.3 now means weak* convergence. 

The following theorem is important in various applications of the 
concept of weak convergence of functionals: 


THEOREM 4. Every bounded sequence $\{f_n\}$ of functionals in the space E*
conjugate to a separable normed linear space E contains a weakly* convergent
subsequence.


Proof. Since E is separable, there is a countable set of points
$x_1, x_2, \ldots, x_n, \ldots$ everywhere dense in E. Suppose the sequence $\{f_n\}$
of functionals in E*, i.e., continuous linear functionals on E, is bounded
(in norm). Then the numerical sequence

\[ f_1(x_1), f_2(x_1), \ldots, f_n(x_1), \ldots \]

is bounded, and hence, by the Bolzano-Weierstrass theorem (see p. 101),
$\{f_n\}$ contains a subsequence

\[ f_1^{(1)}, f_2^{(1)}, \ldots, f_n^{(1)}, \ldots \]

such that the numerical sequence

\[ f_1^{(1)}(x_1), f_2^{(1)}(x_1), \ldots, f_n^{(1)}(x_1), \ldots \]

converges. By the same token, the subsequence $\{f_n^{(1)}\}$ in turn contains a
subsequence

\[ f_1^{(2)}, f_2^{(2)}, \ldots, f_n^{(2)}, \ldots \]

such that the sequence

\[ f_1^{(2)}(x_2), f_2^{(2)}(x_2), \ldots, f_n^{(2)}(x_2), \ldots \]

converges. Continuing this construction, we get a system of subsequences
$\{f_n^{(k)}\}$, k = 1, 2, ... such that

1) $\{f_n^{(k+1)}\}$ is a subsequence of $\{f_n^{(k)}\}$ for all k = 1, 2, ...;
2) $\{f_n^{(k)}\}$ converges at the points $x_1, x_2, \ldots, x_k$.

Hence, taking the “diagonal sequence”

\[ f_1^{(1)}, f_2^{(2)}, \ldots, f_n^{(n)}, \ldots, \]

we get a sequence of continuous linear functionals on E such that

\[ f_1^{(1)}(x_n), f_2^{(2)}(x_n), \ldots \]


11 Read “weak*” as “weak star.”
converges for all n. But then, by Theorem 3′, the sequence

\[ f_1^{(1)}(x), f_2^{(2)}(x), \ldots \]

converges for all $x \in E$. ∎


COROLLARY 1. Every bounded set in the space E* conjugate to a
separable normed linear space E is relatively countably compact in the
weak* topology.


Proof. An immediate consequence of Theorem 4 and the meaning of
relative countable compactness (see Sec. 10.4). ∎


COROLLARY 2. A subset of the space E* conjugate to a separable
Banach space E is bounded if and only if it is relatively countably compact
in the weak* topology.


Proof. An immediate consequence of Theorem 2′ and Corollary 1. ∎


As we will see in a moment, the word “countably” is superfluous in 
Corollaries 1 and 2. First we need 


THEOREM 5. Given a separable normed linear space E, let S be the
closed unit sphere in E and S* the closed unit sphere in the conjugate space
E*. Then the topology induced in S* by the weak* topology in E* is the
same as that induced by the metric

\[ \rho(f, g) = \sum_{n=1}^\infty 2^{-n} |(f - g, x_n)|, \]

where $\{x_1, \ldots, x_n, \ldots\}$ is any countable set everywhere dense in S.


Proof. Clearly, $\rho(f, g)$ has all the properties of a metric, and moreover
is invariant under shifts, in the sense that

\[ \rho(f + h, g + h) = \rho(f, g). \]

Hence we need only verify that
1) Every “open sphere”

\[ Q_\varepsilon = \{ f : \rho(f, 0) < \varepsilon \} \]

contains the intersection of S* with some weak* neighborhood of
zero in E*;

2) Every weak* neighborhood of zero in E* contains the intersection
of S* with some $Q_\varepsilon$.
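Let N be such that $2^{-N} < \varepsilon/2$, and consider the weak* neighborhood of
zero

\[ U = U_{x_1, \ldots, x_N;\, \varepsilon/2} = \left\{ f : |(f, x_1)| < \frac{\varepsilon}{2}, \ldots, |(f, x_N)| < \frac{\varepsilon}{2} \right\}. \]

Then $f \in S^* \cap U$ implies

\[
\begin{aligned}
\rho(f, 0) &= \sum_{n=1}^N 2^{-n} |(f, x_n)| + \sum_{n=N+1}^\infty 2^{-n} |(f, x_n)| \\
&< \frac{\varepsilon}{2} \sum_{n=1}^N 2^{-n} + \sum_{n=N+1}^\infty 2^{-n} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\end{aligned}
\]

and hence $S^* \cap U \subset Q_\varepsilon$. This proves 1).

To prove 2), this time let

\[ U = U_{y_1, \ldots, y_m;\, \delta} = \{ f : |(f, y_1)| < \delta, \ldots, |(f, y_m)| < \delta \} \]

be any weak* neighborhood of zero in E*, where it can clearly be assumed
that $\|y_1\| \le 1, \ldots, \|y_m\| \le 1$. Since $\{x_1, \ldots, x_n, \ldots\}$ is everywhere
dense in S, there are indices $n_1, \ldots, n_m$ such that

\[ \|y_k - x_{n_k}\| < \frac{\delta}{2} \qquad (k = 1, \ldots, m). \]

Let

\[ N = \max \{ n_1, \ldots, n_m \}, \qquad \varepsilon = \frac{\delta}{2^{N+1}}. \]

Then $f \in S^* \cap Q_\varepsilon$ implies

\[ \sum_{n=1}^\infty 2^{-n} |(f, x_n)| < \varepsilon \]

and hence

\[ |(f, x_n)| < 2^n \varepsilon, \]

in particular

\[ |(f, x_{n_k})| < 2^{n_k} \varepsilon \le 2^N \varepsilon = \frac{\delta}{2}. \]

Therefore $f \in S^* \cap Q_\varepsilon$ implies

\[ |(f, y_k)| \le |(f, x_{n_k})| + |(f, y_k - x_{n_k})| < \frac{\delta}{2} + \|f\| \, \|y_k - x_{n_k}\| < \delta, \]

so that $S^* \cap Q_\varepsilon \subset U$. ∎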
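
To get a feeling for the metric of Theorem 5, the following sketch evaluates a truncated version of it for $E = l_2$, identified with its own conjugate space. Several assumptions are made purely for illustration: only finitely many coordinates and finitely many points $x_n$ are kept, the points are generated at random rather than taken from a genuinely dense set, and $x_n$ is supported on the first n coordinates (as can be arranged when enumerating finitely supported vectors). All names are hypothetical.

```python
import random

random.seed(1)
DIM = 40       # number of coordinates kept (assumption)
N_POINTS = 40  # number of points x_n kept (assumption)


def point_in_unit_sphere(support):
    """A vector of norm < 1 whose nonzero coordinates lie among the first `support`."""
    v = [random.uniform(-1.0, 1.0) if j < support else 0.0 for j in range(DIM)]
    length = sum(t * t for t in v) ** 0.5
    scale = random.random() / length if length > 0 else 0.0
    return [scale * t for t in v]


# x_n is supported on the first n coordinates.
xs = [point_in_unit_sphere(n) for n in range(1, N_POINTS + 1)]


def pairing(f, x):
    return sum(s * t for s, t in zip(f, x))


def rho(f, g):
    """Truncated metric: sum over n of 2^{-n} |(f - g, x_n)|."""
    diff = [s - t for s, t in zip(f, g)]
    return sum(2.0 ** -(n + 1) * abs(pairing(diff, x)) for n, x in enumerate(xs))


def e(k):
    """The k-th coordinate functional (1-based)."""
    return [1.0 if j == k - 1 else 0.0 for j in range(DIM)]


zero = [0.0] * DIM
distances = [rho(e(k), zero) for k in (1, 5, 10, 20)]
```

Under these assumptions the k-th coordinate functional pairs to zero with $x_1, \ldots, x_{k-1}$, so its distance from zero is at most $2^{1-k}$ even though its norm is 1. This matches the fact that the coordinate functionals converge to zero in the weak* topology but not in norm.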
We can now drop the word “countably” in Corollaries 1 and 2:


COROLLARY 1′. Every bounded set in the space E* conjugate to a separable
normed linear space E is relatively compact in the weak* topology.

Proof. Use Theorem 5 and the fact that compactness and countable
compactness are equivalent concepts in a metric space (see Sec. 11.2). ∎


COROLLARY 2′. A subset of the space E* conjugate to a separable
Banach space E is bounded if and only if it is relatively compact in the weak*
topology.


Proof. Identical with that of Corollary 1′. ∎
Finally we prove 


THEOREM 6. Every closed sphere in the space (E*, b) conjugate to a 
separable normed linear space E is compact in the weak* topology. 


Proof. Every closed sphere in the space (E*, b) is closed in the weak*
topology. In fact, since a shift in E* carries every closed set (in the
weak* topology) into another closed set, we need only prove the assertion
for every sphere of the form

\[ S_c = \{ f : \|f\| \le c \}. \]

Suppose $f_0 \notin S_c$. Then, by the definition of the norm of the functional
$f_0$, there is an element $x \in E$ such that $\|x\| = 1$ and

\[ f_0(x) = \alpha > c. \]

But then the set

\[ U = \left\{ f : f(x) > \frac{\alpha + c}{2} \right\} \]

is a weak* neighborhood of $f_0$ containing no elements of $S_c$. Therefore
$S_c$ is closed in the weak* topology, and hence compact in the weak*
topology, by Corollary 1′. ∎


Remark. Theorem 6 is a special case of the following more general 
theorem, which will not be proved here: Every bounded subset of the space 
(E*, b) conjugate to a locally convex topological linear space E is relatively 
compact in the weak* topology. 


Problem 1. Given a topological linear space E, suppose E has sufficiently
many continuous linear functionals. Prove that E is a Hausdorff space, when
equipped with the weak topology.


Problem 2. Let $\{x_n\}$ be a sequence of elements in a Hilbert space H such
that

1) $\{x_n\}$ converges weakly to an element $x \in H$;
2) $\|x_n\| \to \|x\|$ as $n \to \infty$.

Prove that $\{x_n\}$ converges strongly to x, i.e., $\|x_n - x\| \to 0$ as $n \to \infty$.
Problem 3. Prove that the conclusion of the preceding problem remains
valid if the condition 2) is replaced by either of the following conditions:

2′) $\|x_n\| \le \|x\|$ for all n;

2″) $\limsup_{n \to \infty} \|x_n\| \le \|x\|$.


Problem 4. Let H be a (separable) Hilbert space and M a bounded subset 
of H. Prove that the topology in M induced by the weak topology in H can 
be specified by a metric. 


Problem 5. Prove that every closed convex subset of a Hilbert space H 
is closed in the weak topology (so that, in particular, every closed linear 
subspace of H is weakly closed). Give an example of a closed set in H which 
is not weakly closed. 


Problem 6. Show that the two conditions in Example 3, p. 199 are
sufficient as well as necessary for weak convergence of a sequence $\{x_n(t)\}$ in
$C_{[a,b]}$. Give an example of a weakly convergent sequence in $C_{[a,b]}$ which is
not strongly convergent.


21. Generalized Functions 


21.1. Preliminary remarks. The degree of generality attaching to the
notion of “function” varies from problem to problem. Some problems
involve continuous functions, others involve functions differentiable one or 
more times, and so on. However, there are a number of situations in which 
the classical notion of a function turns out to be inadequate, even when 
understood in the most general sense (i.e., as an arbitrary rule f assigning a 
number f(x) to each element x in the domain of definition of f). Here are 
two such cases: 


1) A linear mass distribution can be conveniently characterized by giving
the density of the distribution. However, no “ordinary” function can
specify the density corresponding to one or more points with positive
mass.

2) In many problems, situations arise in which various mathematical
operations cannot be carried out. For example, a function with no
derivative (at certain, possibly all, points) cannot be differentiated if
the derivative is interpreted in the usual way, as an “ordinary”
function. Of course, such difficulties can be avoided without relinquishing
classical definitions, by suitably restricting the class of
“admissible functions,” for example, by considering only analytic
functions. However, restricting the class of admissible functions in
this way is often quite undesirable. Fortunately, it turns out that
difficulties of this kind can be overcome, and just as successfully at
that, by enlarging (rather than restricting) the class of admissible
functions, i.e., by introducing the notion of a “generalized function,”
not encountered in classical analysis. In doing so, a key role will be
played by the concept of a conjugate space, considered earlier in this
chapter.


Remark. It cannot be emphasized too strongly that the introduction of 
generalized functions is motivated by the need to solve perfectly concrete 
problems of analysis, and not merely by a desire to see how far the notion 
of function can be pushed. 


Before going into details, we indicate the basic idea behind the theory
of generalized functions. Let f be a fixed function on the real line, integrable
on every finite interval, and let $\varphi$ be any continuous function vanishing outside
some finite interval (such a function $\varphi$ is said to be finite¹²). Suppose each
$\varphi$ is assigned the number

\[ (f, \varphi) = \int_{-\infty}^{\infty} f(x) \varphi(x)\, dx, \tag{1} \]

involving the given function f, where the integration is in effect only over a
finite interval, because of the finiteness of $\varphi$. In other words, the function
f can be regarded as a functional (a linear functional, because of the basic
properties of the integral) defined on some space K of finite functions.
However, there are many other linear functionals on K besides functionals
of the form (1). For example, by assigning each function $\varphi$ its value at the
point x = 0, we get a linear functional which cannot be represented in the
form (1). In this sense, the functions f can be regarded as part of a much
larger set, namely the set of all possible linear functionals on K. The space
K of “test functions” $\varphi$ can be chosen in various ways. For example, K
might consist of all continuous finite functions, as above. However, as will
soon be apparent, it makes sense to require the test functions to satisfy rather
stringent smoothness conditions (besides being continuous and finite).


21.2. The test space and test functions. Generalized functions. Turning
now to details, let K be the set of all finite functions $\varphi$ on $(-\infty, \infty)$ with
continuous derivatives of all orders (equivalently, the set of all infinitely
differentiable finite functions), where every function $\varphi \in K$, being finite, vanishes
outside some interval depending on the choice of $\varphi$. Clearly K is a linear


12 Do not confuse the notion of a finite function (which vanishes outside some finite
interval) with the notion of a bounded function (whose range is contained in some finite
interval). Finite functions are often called “functions of finite (or compact) support.”
space, when equipped with the usual operations of addition of functions and
multiplication of functions by numbers. Although the space K is not 
normable, there is a natural way of introducing the notion of convergence in K: 


DEFINITION 1. A sequence $\{\varphi_n\}$ of functions in K is said to converge to
a function $\varphi \in K$ if
1) There exists an interval outside which all the functions $\varphi_n$ vanish;
2) The sequence $\{\varphi_n^{(k)}\}$ of derivatives of order k converges uniformly
on this interval to $\varphi^{(k)}$ for every k = 0, 1, 2, ....¹³


The linear space K equipped with this notion of convergence is called the 
test space (or fundamental space), and the functions in K are called test 
functions (or fundamental functions). 


DEFINITION 2. Every continuous linear functional $T(\varphi)$ on the test
space K is called a generalized function on $(-\infty, \infty)$, where continuity of
$T(\varphi)$ means that $\varphi_n \to \varphi$ in K implies $T(\varphi_n) \to T(\varphi)$.


Let f(x) be a locally integrable function, i.e., a function integrable on 
every finite interval. Then f(x) generates a generalized function via the 
expression 


\[ T(\varphi) = (f, \varphi) = \int_{-\infty}^{\infty} f(x) \varphi(x)\, dx, \tag{2} \]


which is clearly a continuous linear functional on K. Generalized functions 
of this type will be called regular, and all other generalized functions, i.e., 
those not representable in the form (2), will be called singular. The following 
are all examples of singular generalized functions: 


Example 1. The “delta function”

\[ T(\varphi) = \varphi(0) \tag{3} \]

is a continuous linear functional on K, i.e., a generalized function in the
sense of Definition 2. This functional can be written in the form

\[ T(\varphi) = \int_{-\infty}^{\infty} \delta(x) \varphi(x)\, dx, \tag{4} \]

where $\delta(x)$ is a “fictitious” function,¹⁴ equal to zero everywhere except at
x = 0 and such that

\[ \int_{-\infty}^{\infty} \delta(x)\, dx = 1 \]

13 As always, $\varphi_n^{(0)} = \varphi_n$, $\varphi^{(0)} = \varphi$.


14 The term “delta function” will be applied to both the generalized function $T(\varphi)$ and
the fictitious function $\delta(x)$ generating $T(\varphi)$ via the representation (4).
(these properties are of course paradoxical), since then we have, purely
formally,

\[ T(\varphi) = \int_{-\infty}^{\infty} \delta(x) \varphi(x)\, dx = \varphi(0) \int_{-\infty}^{\infty} \delta(x)\, dx = \varphi(0). \]

The advantage of regarding the delta function as a functional on the test
space K rather than on the space $C_{[a,b]}$ as in Example 3, p. 124 will soon
be apparent.


Example 2. Generalizing (3) and (4), we can write the functional

$$T(\varphi) = \varphi(a) \tag{3'}$$

in the form

$$T(\varphi) = \int_{-\infty}^{\infty} \delta(x - a)\varphi(x)\,dx, \tag{4'}$$

in terms of the "shifted delta function" $\delta(x - a)$.


21.3. Operations on generalized functions. Addition of generalized functions
and multiplication of generalized functions by numbers are defined
in the same way as for linear functionals in general, i.e., by the obvious
analogue of Definition 1, p. 183 (with $\varphi$ and $K$ playing the roles of $x$ and $E$).
In the case of regular generalized functions, these are just the operations
associated with the corresponding operations for "ordinary" functions. More
exactly, if

$$T_f(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx, \qquad T_g(\varphi) = \int_{-\infty}^{\infty} g(x)\varphi(x)\,dx,$$

where $f$ and $g$ are locally integrable and $\varphi \in K$, then clearly

$$(T_f + T_g)(\varphi) = T_f(\varphi) + T_g(\varphi) = T_{f+g}(\varphi)$$

and

$$(\alpha T_f)(\varphi) = \alpha T_f(\varphi) = T_{\alpha f}(\varphi)$$

for any number $\alpha$.


DEFINITION 3. A sequence of generalized functions $\{T_n\}$ is said to converge
to a generalized function $T$ if $T_n(\varphi) \to T(\varphi)$ for every $\varphi \in K$. The
space of generalized functions equipped with this notion of convergence
is denoted by $K^*$.


Remark. In other words, convergence of generalized functions is just 
weak* convergence of continuous linear functionals on K. 
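
Convergence in the sense of Definition 3 can be illustrated numerically. The following Python sketch is an assumption-laden illustration, not part of the text: it uses the Gaussian kernels $f_n(x) = (n/\sqrt{\pi})\,e^{-n^2x^2}$ (a standard choice, introduced here) as regular generalized functions, a rescaled bump `phi` with $\varphi(0) = 1$ as the test function, and a trapezoidal rule in the helper `pairing`. The values $(f_n, \varphi)$ approach $\varphi(0)$, which is the statement that $f_n$ converges to the delta function on this particular $\varphi$.

```python
import math

def bump(x):
    """Test function: positive on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def phi(x):
    """Rescaled bump with phi(0) = 1."""
    return math.e * bump(x)

def gaussian_kernel(n):
    """f_n(x) = n / sqrt(pi) * exp(-(n x)^2); its integral over the line is 1."""
    return lambda x: n / math.sqrt(math.pi) * math.exp(-((n * x) ** 2))

def pairing(f, test, lo=-1.0, hi=1.0, steps=40000):
    """(f, test) = integral of f(x) test(x) dx; test is assumed to vanish outside [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) * test(lo) + f(hi) * test(hi))
    for i in range(1, steps):
        x = lo + i * h
        total += f(x) * test(x)
    return total * h

# Values of (f_n, phi) for increasingly narrow kernels; they should approach phi(0) = 1.
values = [pairing(gaussian_kernel(n), phi) for n in (1, 4, 16, 64)]
print(values)
```

Checking a single test function proves nothing in general, since Definition 3 requires convergence for every $\varphi \in K$; the sketch only makes the definition concrete.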


We will often denote a generalized function by the symbol f, as if a 
representation of the form 


$$(f, \varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx \tag{5}$$

existed, even in the case where the generalized function is singular. Let $f$ be
a regular generalized function, and let $\alpha = \alpha(x)$ be an infinitely differentiable
"ordinary" function. Then (5) implies

$$(\alpha f, \varphi) = \int_{-\infty}^{\infty} \alpha(x)f(x)\varphi(x)\,dx = \int_{-\infty}^{\infty} f(x)\alpha(x)\varphi(x)\,dx = (f, \alpha\varphi),$$

where $\alpha\varphi$ obviously belongs to $K$. Carrying this over to the singular case,
we get


DEFINITION 4. The product $\alpha f$ of an infinitely differentiable function $\alpha$
and a generalized function $f$ is the functional defined by the formula

$$(\alpha f, \varphi) = (f, \alpha\varphi). \tag{6}$$


Remark. It follows from (6) that the functional $\alpha f$ is linear and continuous,
and hence itself a generalized function.


Again let T be a regular generalized function of the form 


$$T(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx, \tag{5}$$

and suppose the derivative $f'$ exists and is locally integrable. Then it is
natural to define the derivative of $T$ as the functional

$$\frac{dT}{dx}(\varphi) = \int_{-\infty}^{\infty} f'(x)\varphi(x)\,dx. \tag{7}$$

Integrating (7) by parts and using the fact that every test function $\varphi$ vanishes
outside some finite interval, we find at once that

$$\frac{dT}{dx}(\varphi) = -\int_{-\infty}^{\infty} f(x)\varphi'(x)\,dx, \tag{8}$$

thereby obtaining an expression for $dT/dx$ which does not involve the
derivative of $f$. Carrying this over to the singular case, we get


DEFINITION 5. The derivative $dT/dx$ of a generalized function $T$ is the
functional defined by the formula

$$\frac{dT}{dx}(\varphi) = -T(\varphi'). \tag{9}$$


Remark 1. The functional (9) is obviously linear and continuous, and
hence itself a generalized function. Second, third and higher-order derivatives
are defined in the same way.

Remark 2. If a generalized function is denoted by the symbol $f$, as in (6),
then its derivative is denoted by $f'$, and (9) takes the form

$$(f', \varphi) = -(f, \varphi'). \tag{9'}$$

It is an immediate consequence of Definition 5 that
1) Every generalized function has derivatives of all orders;
2) If a sequence of generalized functions $\{f_n\}$ converges to a generalized
function $f$ (in the sense of Definition 3), then the sequence of derivatives
$\{f_n'\}$ converges to the derivative $f'$ of the limit function.$^{15}$


Example 1. If $f$ is a regular generalized function whose derivative exists
and is locally integrable (in particular, continuous or piecewise continuous),
then the derivative of $f$ as a generalized function coincides with its derivative
in the ordinary sense. In fact, integrating (8) by parts, we get back (7).


Example 2. As in Example 1, p. 208, consider the delta function

$$T(\varphi) = \int_{-\infty}^{\infty} \delta(x)\varphi(x)\,dx.$$

It follows from Definition 5 that

$$\frac{dT}{dx}(\varphi) = -\int_{-\infty}^{\infty} \delta(x)\varphi'(x)\,dx = -\varphi'(0).$$


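Example 3. Consider the "step function"

$$f(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x > 0, \end{cases} \tag{10}$$

defining the linear functional

$$T(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx = \int_{0}^{\infty} \varphi(x)\,dx.$$

It follows from Definition 5 that

$$\frac{dT}{dx}(\varphi) = -\int_{0}^{\infty} \varphi'(x)\,dx = \varphi(0),$$

since $\varphi$ vanishes at infinity. Hence the derivative of (10) is just the delta
function $\delta(x)$.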
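
The computation in Example 3 can be checked numerically. The sketch below is illustrative only: the shifted bump `phi`, its derivative `phi_prime` (obtained from the chain rule), the constant `SHIFT` and the trapezoidal helper `integrate` are all assumptions introduced here. It evaluates $-\int_0^\infty \varphi'(x)\,dx$ and compares the result with $\varphi(0)$.

```python
import math

def bump(t):
    """Positive on (-1, 1), zero elsewhere, infinitely differentiable."""
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def bump_prime(t):
    """Derivative of bump: bump(t) * (-2 t) / (1 - t^2)^2 inside (-1, 1), zero outside."""
    value = bump(t)
    if value == 0.0:
        return 0.0
    return value * (-2.0 * t) / (1.0 - t * t) ** 2

SHIFT = 0.3  # hypothetical shift, chosen so that phi(0) differs from the peak value

def phi(x):
    """Test function vanishing outside (SHIFT - 1, SHIFT + 1)."""
    return bump(x - SHIFT)

def phi_prime(x):
    return bump_prime(x - SHIFT)

def integrate(g, lo, hi, steps=20000):
    """Composite trapezoidal rule for g on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, steps):
        total += g(lo + i * h)
    return total * h

# (f', phi) = -(f, phi') = -integral of phi' over [0, infinity); phi' vanishes beyond 1 + SHIFT.
step_derivative = -integrate(phi_prime, 0.0, 1.0 + SHIFT)
print(step_derivative, phi(0.0))
```

The two printed numbers agree up to quadrature error, in line with the statement that the derivative of the step function acts on $\varphi$ as the delta function does.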


15 Equivalently, every convergent series of generalized functions can be differentiated
term by term any number of times.


21.4. Differential equations and generalized functions. The development
of the theory of generalized functions was to a large extent motivated by
problems involving differential equations, particularly partial differential
equations. We now discuss a few simple ideas concerning generalized
functions and ordinary differential equations. The application of generalized
functions to partial differential equations is a subject lying beyond the scope
of this book.$^{16}$


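LEMMA 1. A test function $\varphi_0$ can be represented as the derivative of
another test function $\varphi_1$ if and only if

$$\int_{-\infty}^{\infty} \varphi_0(x)\,dx = 0. \tag{11}$$


Proof. If $\varphi_0(x) = \varphi_1'(x)$, where $\varphi_1$ is a test function, then

$$\int_{-\infty}^{\infty} \varphi_0(x)\,dx = \varphi_1(x)\Big|_{-\infty}^{\infty} = 0.$$

Conversely,

$$\varphi_1(x) = \int_{-\infty}^{x} \varphi_0(t)\,dt$$

is an infinitely differentiable function, with derivative $\varphi_0(x)$, and in fact
a finite function if (11) holds, since then $\varphi_0$ and $\varphi_1$ vanish outside the
same interval. ∎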


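LEMMA 2. Let $\varphi_1$ be a fixed test function such that

$$\int_{-\infty}^{\infty} \varphi_1(x)\,dx = 1. \tag{12}$$

Then an arbitrary test function $\varphi$ can be represented in the form

$$\varphi = \varphi_0 + c\varphi_1,$$

where $c$ is a constant and $\varphi_0$ is a test function which is the derivative of
another test function.


Proof. Let

$$c = \int_{-\infty}^{\infty} \varphi(x)\,dx, \qquad \varphi_0(x) = \varphi(x) - \varphi_1(x)\int_{-\infty}^{\infty} \varphi(x)\,dx.$$

Then

$$\int_{-\infty}^{\infty} \varphi_0(x)\,dx = 0,$$

and the proof follows from Lemma 1. ∎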
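
The decomposition in Lemma 2 is easy to carry out numerically. The sketch below is a hedged illustration: the particular test functions `phi1` and `phi`, the normalizing constant `Z` and the trapezoidal helper `integrate` are assumptions made here, not choices from the text. It forms $c = \int \varphi\,dx$ and $\varphi_0 = \varphi - c\varphi_1$, then checks that $\varphi_0$ satisfies condition (11).

```python
import math

def bump(x):
    """Positive on (-1, 1), zero elsewhere, infinitely differentiable."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def integrate(g, lo, hi, steps=20000):
    """Composite trapezoidal rule for g on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, steps):
        total += g(lo + i * h)
    return total * h

Z = integrate(bump, -1.0, 1.0)

def phi1(x):
    """Fixed test function with integral 1, as required by (12)."""
    return bump(x) / Z

def phi(x):
    """An arbitrary test function; this one vanishes outside (-0.25, 0.75)."""
    return bump(2.0 * x - 0.5) * (1.0 + x)

c = integrate(phi, -1.0, 1.0)

def phi0(x):
    """phi0 = phi - c * phi1, the part of phi lying in the space of derivatives."""
    return phi(x) - c * phi1(x)

# By Lemma 1, phi0 is a derivative of a test function exactly when this integral is zero.
residual = integrate(phi0, -1.0, 1.0)
print(c, residual)
```

The residual is zero up to rounding, so the primitive $\int_{-\infty}^{x} \varphi_0(t)\,dt$ vanishes to the right of the support as well as to the left, as Lemma 1 requires.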


16 See, e.g., A. Friedman, Generalized Functions and Partial Differential Equations,
Prentice-Hall, Inc., Englewood Cliffs, N.J. (1963). A key role in the development of the
theory of generalized functions was played by the pioneer work of L. Schwartz, Théorie
des Distributions, Hermann et Cie., Paris, Volume 1 (1957), Volume 2 (1959).


THEOREM 1. Every solution of the differential equation

$$y' = 0 \tag{13}$$

(in the space $K^*$ of generalized functions) is a constant.

Proof. Equation (13) means that

$$(y', \varphi) = (y, -\varphi') = 0 \tag{14}$$

for every $\varphi \in K$. This determines the value of the functional

$$(y, \varphi) = \int_{-\infty}^{\infty} y(x)\varphi(x)\,dx$$

for every function in the space $K' \subset K$ of all test functions which are
derivatives of other test functions. In fact,

$$(y, \varphi_0) = 0$$

for every $\varphi_0 \in K'$. Let $\varphi$ be an arbitrary test function. By Lemma 2,
$\varphi = \varphi_0 + c\varphi_1$, where $\varphi_0 \in K'$ and $\varphi_1$ is a fixed test function satisfying
the condition (12). We are free to give $(y, \varphi_1)$ any value at all, without
violating (14). Let

$$(y, \varphi_1) = \alpha = \text{const}.$$

Then, since $c = \int_{-\infty}^{\infty} \varphi(x)\,dx$ by the proof of Lemma 2,

$$(y, \varphi) = (y, \varphi_0 + c\varphi_1) = (y, \varphi_0) + c(y, \varphi_1) = \alpha c = \alpha\int_{-\infty}^{\infty} \varphi(x)\,dx,$$

i.e., $y$ is the constant $\alpha$, and moreover $y$ satisfies the differential equation (13).
In fact, $\varphi \in K$ implies $-\varphi' \in K'$ and hence

$$(y', \varphi) = (y, -\varphi') = 0. \;∎$$

COROLLARY. If two generalized functions $f$ and $g$ have the same derivative,
then $f = g + \text{const}$.


Proof. Obvious, since $(f - g)' = 0$. ∎


THEOREM 2. Given any generalized function $f$, there is another
generalized function $y$ satisfying the differential equation

$$y' = f(x). \tag{15}$$


Proof. Any generalized function satisfying (15) is called an antiderivative
of $f$. Equation (15) means that

$$(y', \varphi) = (y, -\varphi') = (f, \varphi) = \left(f, \int_{-\infty}^{x} \varphi'(t)\,dt\right) \tag{16}$$

for every $\varphi \in K$. This determines the value of the functional $(y, \varphi)$ for
every function in the space $K' \subset K$ of all test functions which are
derivatives of other test functions. In fact,

$$(y, \varphi_0) = \left(f, -\int_{-\infty}^{x} \varphi_0(t)\,dt\right)$$

for every $\varphi_0 \in K'$. Let $\varphi$ be an arbitrary test function. By Lemma 2,
$\varphi = \varphi_0 + c\varphi_1$, where $\varphi_0 \in K'$ and $\varphi_1$ is a fixed test function satisfying
(12). We are free to give $(y, \varphi_1)$ any value at all, without violating (16).
Let

$$(y, \varphi_1) = \alpha = \text{const}.$$

Then $y$ satisfies the differential equation (15). In fact, $\varphi \in K$ implies
$-\varphi' \in K'$ and hence

$$(y', \varphi) = (y, -\varphi') = \left(f, \int_{-\infty}^{x} \varphi'(t)\,dt\right) = (f, \varphi). \;∎$$


COROLLARY. Any two antiderivatives of a generalized function $f$ differ
only by a constant.


Proof. Obvious by construction or from the corollary to Theorem 1. ∎


21.5. Further developments. We now sketch some of the many extensions 
and modifications of the notion of generalized functions. 


a) Generalized functions of several variables. Let $K^n$ be the set of all
functions $\varphi(x_1, \ldots, x_n)$ of $n$ variables with partial derivatives of all orders
with respect to all arguments, such that every $\varphi \in K^n$ vanishes outside some
parallelepiped

$$a_i \leq x_i \leq b_i \qquad (i = 1, \ldots, n) \tag{17}$$

in $n$-space. Then $K^n$ is a linear space, with addition of functions and
multiplication of functions by numbers defined in the usual way. We introduce
convergence in $K^n$ by the natural generalization of Definition 1, i.e., a
sequence $\{\varphi_k\}$ of functions in $K^n$ is said to converge to a function $\varphi \in K^n$ if


1) There exists a parallelepiped (17) outside which all the functions $\varphi_k$
vanish;
2) The sequence of partial derivatives

$$\frac{\partial^r \varphi_k}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}$$

converges uniformly on this parallelepiped to the partial derivative

$$\frac{\partial^r \varphi}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}$$

for all $\alpha_1, \ldots, \alpha_n$ (with $r = \alpha_1 + \cdots + \alpha_n$).

Every continuous linear functional on $K^n$ is then called a generalized function
of $n$ variables, and moreover every "ordinary" function $f(x_1, \ldots, x_n)$ of $n$
variables integrable on every parallelepiped can be regarded as a generalized
function, in fact the one giving rise to the functional

$$(f, \varphi) = \int f(x)\varphi(x)\,dx,$$

where

$$x = (x_1, \ldots, x_n), \qquad dx = dx_1 \cdots dx_n$$

and the integral is over all of $n$-space. Convergence of generalized functions
is defined by the obvious analogue of Definition 3, while partial derivatives
of generalized functions are defined by the formula

$$\left(\frac{\partial^r f(x)}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \varphi(x)\right) = (-1)^r \left(f(x), \frac{\partial^r \varphi(x)}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}\right).$$

It is clear that every generalized function of $n$ variables has partial derivatives
of all orders.


b) Complex generalized functions. So far we have only considered real 
generalized functions. Suppose the test functions are now allowed to be 
complex-valued, but still finite and infinitely differentiable. Then every 
continuous linear functional on the corresponding test space K is called a 
complex generalized function. If $(f, \varphi)$ is such a functional, then

$$(f, \alpha\varphi) = \alpha(f, \varphi).$$

We can also consider conjugate-linear functionals on $K$, satisfying the
condition (cf. p. 123)

$$(f, \alpha\varphi) = \bar{\alpha}(f, \varphi),$$

where the overbar denotes the complex conjugate. If $f$ is an "ordinary"
complex-valued function on the line, there are two natural ways of associating
linear functionals with $f$, i.e.,

$$(f, \varphi)_1 = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx,$$

$$(f, \varphi)_2 = \int_{-\infty}^{\infty} \overline{f(x)}\,\varphi(x)\,dx,$$

and two natural ways of associating conjugate-linear functionals with $f$:

$$(f, \varphi)_3 = \int_{-\infty}^{\infty} f(x)\overline{\varphi(x)}\,dx,$$

$$(f, \varphi)_4 = \int_{-\infty}^{\infty} \overline{f(x)}\,\overline{\varphi(x)}\,dx.$$

Each of these four choices corresponds to a possible way of embedding the
space of "ordinary" functions in the space of generalized functions. Operations
on complex generalized functions are defined by analogy with the real
case.


c) Generalized functions on the circle. Sometimes it is convenient to
consider generalized functions defined on a bounded set. As a simple example,
consider generalized functions on a circle $C$, choosing the test space $K_C$ to
be the set of all infinitely differentiable functions on $C$, equipped with the
usual operations of addition of functions and multiplication of functions by
numbers. (Note that the test functions are now automatically finite, since $C$
is bounded.) Then every continuous linear functional on $K_C$ is called a
generalized function on the circle. Every "ordinary" function on $C$ can be
regarded as a periodic function on the line. In the same way, we regard
every generalized function on the circle as a periodic generalized function,
where a generalized function $f$ is said to be periodic, with period $a$, if

$$(f(x), \varphi(x - a)) = (f(x), \varphi(x))$$

for every test function $\varphi \in K$.


d) Other test spaces. There are many possible choices of the test space
other than the space of infinitely differentiable finite functions. For example,
we can choose the test space to be the somewhat larger space $S_\infty$ of all
infinitely differentiable functions which, together with all their derivatives,
approach zero faster than any power of $1/|x|$. More exactly, a function $\varphi$
belongs to $S_\infty$ if and only if, given any $p, q = 0, 1, 2, \ldots$, there is a constant
$C_{pq}$ (depending on $p$, $q$ and $\varphi$) such that$^{17}$

$$|x^p \varphi^{(q)}(x)| < C_{pq} \qquad (-\infty < x < \infty).$$

A sequence $\{\varphi_n\}$ of functions in $S_\infty$ is said to converge to a function $\varphi \in S_\infty$ if


1) The sequence $\{\varphi_n^{(q)}\}$ converges uniformly to $\varphi^{(q)}$ on every finite interval;
2) The constants $C_{pq}$ in the inequalities

$$|x^p \varphi_n^{(q)}(x)| < C_{pq}$$

can be chosen independently of $n$.


There are somewhat fewer continuous linear functionals on $S_\infty$ than on $K$.
For example, the function f(x) = e* corresponds to a continuous linear 
functional $(f, \varphi)$ on $K$ but not on $S_\infty$.


Remark. As the theory of generalized functions has evolved, it has
become apparent that there is no need to commit oneself once and for all
to any definite choice of test space. Rather it is best to choose a test space
which is most suitable for solving the class of problems at hand. In general,
the smaller the test space, the greater the freedom in carrying out various
analytical operations (differentiation, passage to the limit, etc.) and the larger
the number of continuous linear functionals on the space (why?). However,
we must make sure not to make the test space too small, i.e., we must require
not only that the test functions be "sufficiently smooth" but also that there be
"sufficiently many" of them (in the sense of Problem 9) to allow us to "tell
ordinary functions$^{18}$ apart."


17 As an exercise, verify that this is the same space $S_\infty$ as in Problem 12b, p. 172.


Problem 1. In the test space $K$ of all infinitely differentiable finite functions,
let $\mathscr{B}$ be the neighborhood base at zero consisting of all sets of the
form

$$U_{\gamma_0, \ldots, \gamma_m} = \{\varphi : \varphi \in K,\ |\varphi(x)| < \gamma_0(x), \ldots, |\varphi^{(m)}(x)| < \gamma_m(x) \text{ for all } x\}$$

for some positive functions $\gamma_0, \ldots, \gamma_m$ continuous on $(-\infty, \infty)$. Prove that
the topology generated in $K$ by $\mathscr{B}$ leads to the same kind of convergence
in $K$ as in Definition 1.


Comment. There are other topologies in K leading to the same conver- 
gence. 


Problem 2. Let $K$ be the test space of all infinitely differentiable finite
functions, and let $K_m$ be the subspace of $K$ consisting of all functions $\varphi \in K$
vanishing outside the interval $[-m, m]$. We can make $K_m$ into a countably
normed space by setting

$$\|\varphi\|_n = \sup_{\substack{0 \leq k \leq n \\ |x| \leq m}} |\varphi^{(k)}(x)| \qquad (n = 0, 1, 2, \ldots)$$

(cf. Problem 12a, p. 171). Verify that the topology induced in $K_m$ by the
system of norms $\|\cdot\|_n$ coincides with the topology induced in $K_m$ by the
topology of Problem 1. Verify that the convergence induced in $K_m$
by the norms $\|\cdot\|_n$ coincides with the convergence induced in $K_m$ by the
convergence in Definition 1. Clearly $K_1 \subset K_2 \subset \cdots \subset K_m \subset \cdots$, and

$$K = \bigcup_{m=1}^{\infty} K_m.$$

Show that a set $Q \subset K$ is bounded with respect to the topology in $K$ if and
only if there is an integer $m$ such that $Q$ is a bounded subset of the countably
normed space $K_m$.


18 More exactly, regular generalized functions.


Problem 3. Let $K$ and $K_m$ be the same as in Problem 2, and let $T$ be a
linear functional on $K$. Prove that the following four conditions are
equivalent:


a) $T$ is continuous with respect to the topology of the space $K$;

b) $T$ is bounded on every bounded subset $Q \subset K$;

c) If $\varphi_n \in K$ and $\varphi_n \to 0$, then $T(\varphi_n) \to 0$ (provided convergence of
sequences is defined as in Definition 1);

d) The restriction $T_m$ of the functional $T$ to the space $K_m \subset K$ is a
continuous functional on $K_m$ for every $m = 1, 2, \ldots$


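Problem 4. Let

$$T(\varphi) = \int_{-\infty}^{\infty} \frac{1}{x}\,\varphi(x)\,dx \tag{18}$$

for every $\varphi$ in the test space $K$. Prove that $T(\varphi)$ is a generalized function
if the integral is understood in the sense of the Cauchy principal value.

Hint. If $\varphi$ vanishes outside the interval $[a, b]$, write

$$\int_{-\infty}^{\infty} \frac{\varphi(x)}{x}\,dx = \int_{a}^{b} \frac{\varphi(x) - \varphi(0)}{x}\,dx + \int_{a}^{b} \frac{\varphi(0)}{x}\,dx.$$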

Problem 5. Prove that the delta function and its derivative are singular 
generalized functions. Prove that the same is true of (18). 


Problem 6. Prove that addition of two generalized functions and
multiplication of a generalized function by an infinitely differentiable
function $\alpha$ (in particular, a constant) are continuous operations in the sense
that $f_n \to f$, $g_n \to g$ implies $f_n + g_n \to f + g$, $\alpha f_n \to \alpha f$. Prove that there
is no way of similarly defining a continuous product of two generalized
functions, unless the functions are regular, in which case the appropriate
definition is $T_{fg} = T_f T_g$, where

$$T_f(\varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx, \qquad T_g(\varphi) = \int_{-\infty}^{\infty} g(x)\varphi(x)\,dx,$$

$$T_{fg}(\varphi) = \int_{-\infty}^{\infty} f(x)g(x)\varphi(x)\,dx.$$


Problem 7. Let $f$ be a piecewise continuous function on $(-\infty, \infty)$,
differentiable everywhere except at the points $x_1, x_2, \ldots, x_n, \ldots$, where it
has jumps

$$f(x_n + 0) - f(x_n - 0) = h_n \qquad (n = 1, 2, \ldots).$$

Prove that the generalized derivative of $f$ (i.e., the derivative of $f$ regarded as
a generalized function) is the sum of its ordinary derivative (at the points
where it exists) and the generalized function

$$g(x) = \sum_{n} h_n\,\delta(x - x_n).$$


Comment. Note that $(g, \varphi)$ reduces to a finite sum for every test function $\varphi$.

Problem 8. Find the generalized derivative of the function of period $2\pi$
equal to

$$f(x) = \begin{cases} \dfrac{\pi - x}{2} & \text{if } 0 < x \leq \pi, \\ 0 & \text{if } x = 0, \\ -\dfrac{\pi + x}{2} & \text{if } -\pi \leq x < 0 \end{cases} \tag{19}$$

in the interval $[-\pi, \pi]$.

Ans. $f'(x) = -\dfrac{1}{2} + \pi \sum_{n} \delta(x - 2n\pi)$.

Comment. The function (19) is the sum of the trigonometric series

$$\sum_{n=1}^{\infty} \frac{\sin nx}{n}. \tag{20}$$

Differentiating (20) term by term, we get the divergent series

$$\sum_{n=1}^{\infty} \cos nx.$$


Hence the concept of a generalized function allows us to ascribe a definite 
meaning to a series that diverges in the ordinary sense. The same can be 
done for many divergent integrals (like those encountered in quantum field 
theory and other branches of theoretical physics). 
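
The meaning ascribed to the divergent series $\sum \cos nx$ can be tested numerically on a single test function. The Python sketch below is illustrative and rests on assumptions introduced here: the test function `phi` is a bump vanishing outside $(-1, 1)$, so only the term $\delta(x)$ of the answer to Problem 8 contributes, and integrals are approximated by a trapezoidal rule in `partial_sum_pairing`. The partial sums, paired with $\varphi$, should approach $\pi\varphi(0) - \tfrac{1}{2}\int \varphi(x)\,dx$.

```python
import math

def phi(x):
    """Test function vanishing outside (-1, 1), which lies inside (-pi, pi)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def partial_sum_pairing(N, steps=4000):
    """(sum_{n=1}^{N} cos(n x), phi), by the trapezoidal rule on [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(1, steps):  # phi vanishes at the endpoints, so they contribute nothing
        x = -1.0 + i * h
        series = sum(math.cos(n * x) for n in range(1, N + 1))
        total += series * phi(x)
    return total * h

def integral_of_phi(steps=4000):
    """Integral of phi over [-1, 1], by the same rule."""
    h = 2.0 / steps
    return h * sum(phi(-1.0 + i * h) for i in range(1, steps))

# Value predicted by f'(x) = -1/2 + pi * sum of delta(x - 2 n pi), restricted to (-pi, pi).
limit_value = math.pi * phi(0.0) - 0.5 * integral_of_phi()
approx = partial_sum_pairing(120)
print(approx, limit_value)
```

Although the numerical series $\sum \cos nx$ diverges at every point, the numbers $(\sum_{n \leq N} \cos nx, \varphi)$ settle down quickly as $N$ grows, which is the sense of convergence in Definition 3.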


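Problem 9. Prove that the test space $K$ of all infinitely differentiable finite
functions has "sufficiently many" functions in the sense that, given any two
distinct continuous functions $f_1$ and $f_2$, there exists a function $\varphi \in K$ such that

$$\int_{-\infty}^{\infty} f_1(x)\varphi(x)\,dx \neq \int_{-\infty}^{\infty} f_2(x)\varphi(x)\,dx.$$


Hint. Since $f(x) = f_1(x) - f_2(x) \not\equiv 0$, there is a point $x_0$ such that
$f(x_0) \neq 0$, and hence an interval $[\alpha, \beta]$ in which $f(x)$ does not change sign.
Let

$$\varphi(x) = \begin{cases} e^{-1/(x - \alpha)^2}\, e^{-1/(x - \beta)^2} & \text{if } \alpha < x < \beta, \\ 0 & \text{otherwise}. \end{cases}$$

Then $\varphi \in K$ and

$$\int_{-\infty}^{\infty} f(x)\varphi(x)\,dx = \int_{\alpha}^{\beta} f(x)\varphi(x)\,dx \neq 0.$$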


Comment. This result can be extended to functions more general than
continuous functions, with the help of the concept of the Lebesgue integral
(introduced in Sec. 29).

Problem 10. Consider the homogeneous system of $n$ linear differential
equations

$$y_i' = \sum_{k=1}^{n} a_{ik}(x)\,y_k \qquad (i = 1, \ldots, n) \tag{21}$$

in $n$ unknowns $y_1, \ldots, y_n$, where the $a_{ik}$ are infinitely differentiable functions.
Prove that every solution of (21) in the class $K^*$ of generalized functions is a
set of "ordinary" (in fact, infinitely differentiable) functions.


Comment. This can be expressed by saying that every "generalized
solution" of (21) is also a "classical solution."

Problem 11. Consider the nonhomogeneous system of $n$ linear differential
equations

$$y_i' = \sum_{k=1}^{n} a_{ik}(x)\,y_k + f_i(x) \qquad (i = 1, \ldots, n), \tag{22}$$

where the $a_{ik}$ are infinitely differentiable functions and the $f_i$ are generalized
functions. Prove that (22) has a generalized solution, which is unique to
within a solution of the homogeneous system (21). What happens if the $f_i$
are "ordinary" functions?


Problem 12. Interpret

$$f(x) = \sum_{n=1}^{\infty} \cos nx$$

as a periodic generalized function.

Hint. Recall Problem 8.

Problem 13. Show that $S_\infty$ becomes a countably normed space when
equipped with the system of norms

$$\|\varphi\|_n = \sum_{p+q=n}\ \sup_{\substack{-\infty < x < \infty \\ 0 \leq i \leq p \\ 0 \leq j \leq q}} \left|(1 + |x|^i)\,\varphi^{(j)}(x)\right|.$$

Prove that convergence of sequences in this countably normed space is
equivalent to convergence of sequences in $S_\infty$ as defined on p. 216.


6 


LINEAR OPERATORS 


22. Basic Concepts 


22.1. Definitions and examples. Given two topological linear spaces $E$ and
$E_1$, any mapping

$$y = Ax \qquad (x \in E,\ y \in E_1)$$

of a subset of $E$ (possibly $E$ itself) into $E_1$ is called an operator (from $E$ to
$E_1$). The operator $A$ is said to be linear if

$$A(\alpha x_1 + \beta x_2) = \alpha Ax_1 + \beta Ax_2.$$

Let $D_A$ be the set of all $x \in E$ for which $A$ is defined. Then $D_A$ is called the
domain (of definition) of the operator $A$. Although in general $D_A$ need not
equal $E$, we will always assume that $D_A$ is a linear subspace of $E$, i.e., that
$x, y \in D_A$ implies $\alpha x + \beta y \in D_A$ for all $\alpha$ and $\beta$.

The operator $A$ is said to be continuous at the point $x_0 \in D_A$ if, given any
neighborhood $V$ of the point $y_0 = Ax_0$, there is a neighborhood $U$ of the point
$x_0$ such that $Ax \in V$ for all $x \in U \cap D_A$. We say that the operator $A$ is
continuous if it is continuous at every point $x_0 \in D_A$.


Remark 1. Suppose $E$ and $E_1$ are normed linear spaces. Then it is easy
to see that $A$ is continuous if and only if, given any $\varepsilon > 0$, there is a $\delta > 0$
such that

$$\|x' - x''\| < \delta \qquad (x', x'' \in D_A)$$

implies

$$\|Ax' - Ax''\| < \varepsilon.$$


Remark 2. In the case where $E_1$ is the real line, the concept of a linear
operator reduces to that of a linear functional, and the definition of continuity
reduces to that given on p. 175. As we will see below, much of the theory
of linear functionals carries over in a straightforward way to the case of
linear operators.


Example 1. Given a topological linear space $E$, let $Ix = x$ for all $x \in E$.
Then $I$ is a continuous linear operator, called the identity (or unit) operator,
carrying each element of $E$ into itself.


Example 2. Let $E$ and $E_1$ be arbitrary topological linear spaces, and let
$Ox = 0$ for all $x \in E$, where $0$ is the zero element of the space $E_1$. Then $O$
is a continuous linear operator, called the zero operator.


Example 3. Suppose $A$ is a linear operator mapping the $m$-dimensional
space $R^m$ with basis $e_1, \ldots, e_m$ into the $n$-dimensional space $R^n$ with basis
$e_1', \ldots, e_n'$. If $x$ is an arbitrary vector in $R^m$, then

$$x = \sum_{j=1}^{m} x_j e_j$$

and hence, by the linearity of $A$,

$$y = Ax = \sum_{j=1}^{m} x_j Ae_j.$$

Thus the operator $A$ is completely determined once we know the vectors in
$R^n$ into which $A$ carries the basis vectors $e_1, \ldots, e_m$. Suppose we expand
each vector $Ae_j$ with respect to the basis $e_1', \ldots, e_n'$, obtaining

$$Ae_j = \sum_{i=1}^{n} a_{ij} e_i'.$$

Then

$$y = \sum_{i=1}^{n} y_i e_i' = \sum_{j=1}^{m} x_j Ae_j = \sum_{j=1}^{m} x_j \sum_{i=1}^{n} a_{ij} e_i'$$

and hence

$$y_i = \sum_{j=1}^{m} a_{ij} x_j,$$

i.e., the operator $A$ is completely determined by the matrix $\|a_{ij}\|$ made up of
the coefficients $a_{ij}$.
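
The formula $y_i = \sum_j a_{ij}x_j$ translates directly into code. The following Python sketch is illustrative only: the matrix `A` and the helper names `apply` and `column` are hypothetical choices made here. It applies an operator from $R^3$ into $R^2$ to a vector, recovers the images of the basis vectors as the columns of the matrix, and checks linearity on one example.

```python
def apply(matrix, x):
    """Return y with y_i = sum_j a_ij x_j, where matrix[i][j] = a_ij."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def column(matrix, j):
    """Coordinates of A e_j with respect to the basis of the target space."""
    return [row[j] for row in matrix]

# A hypothetical operator from R^3 into R^2 (n = 2 rows, m = 3 columns).
A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]

e1 = [1.0, 0.0, 0.0]
u = [1.0, 2.0, 3.0]
v = [0.0, 1.0, -1.0]

# A e_1 is the first column of the matrix.
print(apply(A, e1), column(A, 0))

# Linearity: A(2u + 3v) equals 2 Au + 3 Av.
combo = [2.0 * ui + 3.0 * vi for ui, vi in zip(u, v)]
lhs = apply(A, combo)
rhs = [2.0 * p + 3.0 * q for p, q in zip(apply(A, u), apply(A, v))]
print(lhs, rhs)
```

Storing the matrix row by row matches the index convention $a_{ij}$ of the text, with $i$ labelling coordinates in the target space and $j$ labelling coordinates in the source space.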


Example 4. Let $H_1$ be any subspace of a Hilbert space $H$, and let
$H_2 = H \ominus H_1$ be the orthogonal complement of $H_1$, so that an arbitrary
element $h \in H$ has a unique representation of the form

$$h = h_1 + h_2 \qquad (h_1 \in H_1,\ h_2 \in H_2)$$

(see Theorem 14, p. 158). Let

$$Ph = h_1.$$

Then $P$ is a continuous linear operator, called a projection operator. Interpreted
geometrically, $P$ "projects the whole space $H$ onto the subspace $H_1$."


22.2. Continuity and boundedness. A linear operator mapping $E$ into $E_1$
is said to be bounded if it maps every bounded subset of $E$ into a bounded
subset of $E_1$. The operator analogue of Theorem 3, p. 176 for functionals is
given by


THEOREM 1. A necessary condition for a linear operator $A$ to be continuous
on a topological linear space $E$ is that $A$ be bounded. The condition
is also sufficient if $E$ satisfies the first axiom of countability.


Proof. To prove the necessity, suppose $A$ is continuous and suppose
there is a bounded set $M$ in $E$ whose image $AM = \{y : y = Ax,\ x \in M\}$
is unbounded in $E_1$. Then there is a neighborhood $V$ of zero in $E_1$ such
that none of the sets

$$\frac{1}{n}\,AM \qquad (n = 1, 2, \ldots)$$

is contained in $V$. Hence there is a sequence $\{x_n\}$ of elements of $M$ such
that none of the elements

$$\frac{1}{n}\,Ax_n \qquad (n = 1, 2, \ldots)$$

belongs to $V$. But then the sequence

$$\left\{\frac{1}{n}\,x_n\right\}$$

converges to zero in $E$ (recall Problem 6b, p. 170), while the sequence

$$\left\{\frac{1}{n}\,Ax_n\right\}$$

fails to converge to zero in $E_1$, contrary to the assumption that $A$ is
continuous.
As for the sufficiency, let $\{U_n\}$ be a countable neighborhood base at
zero in $E$ such that

$$U_1 \supset U_2 \supset \cdots \supset U_n \supset \cdots$$

If $A$ fails to be continuous on $E$, then, by the operator analogue of
Theorem 1, p. 175,$^{1}$ there is a neighborhood $V$ of zero in $E_1$ and a sequence
$\{x_n\}$ in $E$ such that

$$x_n \in \frac{1}{n}\,U_n, \qquad Ax_n \notin V \qquad (n = 1, 2, \ldots).$$

The sequence $\{nx_n\}$ is bounded in $E$ (and even converges to zero), while
the sequence $\{nAx_n\}$ is unbounded in $E_1$, since it is contained in none
of the sets $nV$. But then $A$ fails to be bounded on the bounded set
$\{x_1, 2x_2, \ldots, nx_n, \ldots\}$, contrary to hypothesis. ∎


1 As an exercise, state and prove this analogue.


Next we consider the operator analogues of Definition 2 and Theorem 4,
p. 177. Suppose $E$ and $E_1$ are both normed linear spaces, so that in particular,
$E$ satisfies the first axiom of countability. Then, by Theorem 1, a linear
operator $A$ mapping $E$ into $E_1$ is continuous if and only if it is bounded.
But by a bounded set in a normed linear space we mean a set contained in
some closed sphere $\|x\| \leq C$. Therefore a linear operator $A$ on a normed
linear space is bounded (and hence continuous) if and only if it is bounded
on every closed sphere $\|x\| \leq C$, or equivalently on the closed unit sphere
$\|x\| \leq 1$, because of the linearity of $A$. In other words, $A$ is bounded if
and only if the number

$$\|A\| = \sup_{\|x\| \leq 1} \|Ax\| \tag{1}$$

is finite.


DEFINITION. Given a bounded linear operator mapping a normed linear
space $E$ into another normed linear space $E_1$, the number (1), equal to the
least upper bound of $\|Ax\|$ on the closed unit sphere $\|x\| \leq 1$, is called the
norm of $A$.


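THEOREM 2. The norm $\|A\|$ has the following two properties:

$$\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}, \tag{2}$$

$$\|Ax\| \leq \|A\|\,\|x\| \quad \text{for all } x \in E. \tag{3}$$

Proof. Clearly,

$$\|A\| = \sup_{\|x\| \leq 1} \|Ax\| = \sup_{\|x\| = 1} \|Ax\|$$

(why?). But the set of all vectors in $E$ of norm 1 coincides with the set of
all vectors

$$\frac{x}{\|x\|} \qquad (x \in E,\ x \neq 0), \tag{4}$$

and hence

$$\|A\| = \sup_{\|x\| = 1} \|Ax\| = \sup_{x \neq 0} \left\|A\,\frac{x}{\|x\|}\right\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|},$$

which proves (2). Moreover, since the vectors (4) all have norm 1, it
follows from (1) that

$$\left\|A\,\frac{x}{\|x\|}\right\| = \frac{\|Ax\|}{\|x\|} \leq \|A\| \qquad (x \in E,\ x \neq 0),$$

which implies (3) for $x \neq 0$. The validity of (3) for $x = 0$ is obvious. ∎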
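
For an operator on the plane with the Euclidean norm, the supremum in (1) and (2) can be estimated by sampling the unit circle. The sketch below is a hedged illustration: the matrix `A`, the sampling helper `operator_norm_2d` and the other names are assumptions introduced here, and sampling gives only a lower estimate of the supremum in general (for this diagonal example the maximum is attained at a sampled point).

```python
import math

def apply(matrix, x):
    """y_i = sum_j a_ij x_j."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in x))

def operator_norm_2d(matrix, samples=3600):
    """Estimate sup ||Ax|| over unit vectors (cos t, sin t) by sampling the circle."""
    best = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        best = max(best, norm(apply(matrix, [math.cos(t), math.sin(t)])))
    return best

def ratio(matrix, x):
    """The quotient ||Ax|| / ||x|| appearing in (2), for x != 0."""
    return norm(apply(matrix, x)) / norm(x)

A = [[3.0, 0.0],
     [0.0, 1.0]]  # hypothetical example: stretches the first axis by a factor of 3

norm_A = operator_norm_2d(A)
x = [1.0, 1.0]
print(norm_A)                                    # estimate of ||A||
print(norm(apply(A, x)), norm_A * norm(x))       # property (3): left side <= right side
print(ratio(A, x), ratio(A, [5.0 * t for t in x]))  # the quotient in (2) ignores scaling
```

The last line reflects the step in the proof of Theorem 2 where $x$ is replaced by $x/\|x\|$: the quotient $\|Ax\|/\|x\|$ depends only on the direction of $x$.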
22.3. Sums and products of operators. Let $A$ and $B$ be two operators from
one topological linear space $E$ to another topological linear space $E_1$. Then
by the sum of $A$ and $B$, denoted by $A + B$, we mean the operator assigning
the element

$$y = Ax + Bx \in E_1$$

to each $x \in E$. The domain $D_C$ of the sum $C = A + B$ is just the intersection
$D_A \cap D_B$ of the domains of $A$ and $B$. It is clear that $C$ is linear if $A$ and $B$
are linear, and continuous if $A$ and $B$ are continuous. Let $E$ and $E_1$ be normed
linear spaces, and suppose $A$ and $B$ are bounded operators. Then $C = A + B$
is also bounded, with norm

$$\|C\| \leq \|A\| + \|B\|,$$

since, by Theorem 2 and Problem 10,

$$\|Cx\| = \|Ax + Bx\| \leq \|Ax\| + \|Bx\| \leq (\|A\| + \|B\|)\,\|x\|$$

for every $x \in E$.

Next, given three topological linear spaces $E$, $E_1$ and $E_2$, let $A$ be an
operator from $E$ to $E_1$ and $B$ an operator from $E_1$ to $E_2$. Then by the product
of $A$ and $B$, denoted by $BA$ (in that order), we mean the operator assigning
the element

$$z = B(Ax) \in E_2$$

to each $x \in E$. The domain $D_C$ of the product $C = BA$ consists of those
$x \in D_A$ for which $Ax \in D_B$. Again it is clear that $C$ is linear if $A$ and $B$ are
linear, and continuous if $A$ and $B$ are continuous. Let $E$, $E_1$ and $E_2$ be normed
linear spaces, and suppose $A$ and $B$ are bounded operators. Then $C = BA$ is
also bounded, with norm

$$\|C\| \leq \|A\|\,\|B\|,$$

since

$$\|Cx\| = \|B(Ax)\| \leq \|B\|\,\|Ax\| \leq \|B\|\,\|A\|\,\|x\|.$$


Remark 1. Sums and products of three or more operators are defined
in the natural way, e.g.,

$$CBA = C(BA) = (CB)A,$$
$$A + B + C = A + (B + C) = (A + B) + C.$$

Note that addition of operators is associative and commutative, while
multiplication of operators is associative but in general not commutative
(give an example where $AB \neq BA$).
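
One possible example of noncommuting operators, among many, uses two operators on the plane. In the Python sketch below, the matrices `A` and `B` and the helpers `apply` and `matmul` are hypothetical choices made here for illustration; `matmul(B, A)` represents the product $BA$, i.e., $A$ applied first and then $B$.

```python
def apply(matrix, x):
    """y_i = sum_j a_ij x_j."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

def matmul(left, right):
    """Matrix of the product operator: apply `right` first, then `left`."""
    rows, inner, cols = len(left), len(right), len(right[0])
    return [[sum(left[i][k] * right[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

A = [[0.0, 1.0],
     [0.0, 0.0]]
B = [[0.0, 0.0],
     [1.0, 0.0]]

BA = matmul(B, A)
AB = matmul(A, B)
x = [2.0, 5.0]

print(BA, AB)                              # the two products differ
print(apply(BA, x), apply(B, apply(A, x)))  # (BA)x agrees with B(Ax)
```

Here $BA$ keeps the second coordinate and annihilates the first, while $AB$ does the opposite, so the order of the factors matters.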


Remark 2. By the product $\alpha A$ of the operator $A$ and the number $\alpha$ is
meant the operator assigning the element $\alpha Ax$ to each $x \in E$. Let $\mathscr{L}(E, E_1)$
be the set of all continuous linear operators mapping $E$ into $E_1$. Then $\mathscr{L}(E, E_1)$
is clearly a linear space when equipped with the operations of addition of
operators and multiplication of operators by numbers.

Problem 1. Prove that every linear operator on a finite-dimensional space
is automatically continuous (cf. Problem 2, p. 181).


Problem 2. Let $A$ be a linear operator mapping $m$-space $R^m$ into $n$-space
$R^n$. Prove that the image of $R^m$, i.e., the set $\{y : y = Ax,\ x \in R^m\}$, has
dimension no greater than $m$.


Problem 3. Let $C_{[a,b]}$ be the linear space of functions continuous on the
interval $a \leq x \leq b$, equipped with the norm

$$\|f\| = \max_{a \leq x \leq b} |f(x)|.$$

Let $K(x, y)$ be a fixed function of two variables, continuous on the square
$a \leq x \leq b$, $a \leq y \leq b$, and let $A$ be the operator defined by

$$g(x) = Af(x) = \int_{a}^{b} K(x, y)f(y)\,dy.$$

Prove that $A$ is a continuous linear operator mapping $C_{[a,b]}$ into itself.


Problem 4. Let $C^2_{[a,b]}$ be the space of functions continuous on $[a, b]$,
equipped with the norm

$$\|f\| = \sqrt{\int_{a}^{b} f^2(x)\,dx},$$

and let $A$ be the same as in the preceding problem. Prove that $A$ is a
continuous linear operator mapping $C^2_{[a,b]}$ into itself.


Problem 5. Given a fixed function $\varphi(x)$ continuous on $[a, b]$, let $A$ be the
mapping defined by

$$g(x) = Af(x) = \varphi(x)f(x).$$

Prove that $A$ is a continuous linear operator on both spaces $C_{[a,b]}$ and $C^2_{[a,b]}$,
mapping each space into itself.


Problem 6. Let $C^{(1)}_{[a,b]}$ be the set of all continuously differentiable functions
on $[a, b]$, and let $D$ be the differentiation operator, defined by

$$Df(x) = f'(x)$$

for all $f \in C^{(1)}_{[a,b]}$. Prove that


a) $C^{(1)}_{[a,b]}$ is a linear space;

b) $D$ is a linear operator mapping $C^{(1)}_{[a,b]}$ onto $C_{[a,b]}$;

c) $D$ is not continuous on $C_{[a,b]}$;

d) $D$ is continuous with respect to the norm

$$\|f\|_1 = \max_{a \leq x \leq b} |f(x)| + \max_{a \leq x \leq b} |f'(x)|.$$

Problem 7. Let $K_{[a,b]}$ be the space of infinitely differentiable functions
on $[a, b]$, equipped with the topology generated by the countable system of
norms

$$\|f\|_n = \sup_{\substack{a \leq x \leq b \\ 0 \leq k \leq n}} |f^{(k)}(x)|$$

(cf. Problem 12a, p. 171). Prove that the differentiation operator $D$ is a
continuous linear operator on $K_{[a,b]}$, mapping $K_{[a,b]}$ onto itself.


Problem 8. Interpret the differentiation operator as a continuous linear
operator on the space of all generalized functions.


Hint. Take continuity to mean that if a sequence of generalized functions
$\{f_n(x)\}$ converges to a generalized function $f(x)$, then $\{f_n'(x)\}$ converges to
$f'(x)$.

Problem 9. Prove that


a) The operators in Problems 3-7 and Examples 1-4, p. 222 are all
bounded;

b) A linear operator on a countably normed space is continuous if and
only if it is bounded.


Problem 10. Let $A$ be a bounded linear operator mapping a normed
linear space $E$ into another normed linear space $E_1$. Suppose $\|A\|$ is defined
as the smallest number $C$ such that $\|Ax\| \leq C\,\|x\|$ for all $x \in E$. Prove that
$\|A\|$ is the same number as in the definition on p. 224. Particularize this to
the case of a bounded linear functional on $E$.


Problem 11. Let E and $E_1$ be normed linear spaces, and let $\mathcal{L}(E, E_1)$ be
the same as in Remark 2 above. Prove that

a) $\mathcal{L}(E, E_1)$ is a normed linear space;
b) If $E_1$ is complete, so is $\mathcal{L}(E, E_1)$;
c) If $E_1$ is complete, $A_k \in \mathcal{L}(E, E_1)$ and

    $$\sum_{k=1}^\infty \|A_k\| < \infty,$$

then the series

    $$\sum_{k=1}^\infty A_k$$

converges to an operator $A \in \mathcal{L}(E, E_1)$ and

    $$\|A\| = \Bigl\|\sum_{k=1}^\infty A_k\Bigr\| \le \sum_{k=1}^\infty \|A_k\|.$$


23. Inverse and Adjoint Operators


23.1. The inverse operator. Invertibility. Given two topological linear
spaces E and $E_1$, let A be an operator from E to $E_1$, with domain $D_A \subset E$ and
range $R_A = \{y : y = Ax,\ x \in D_A\}$. Then A is said to be invertible if the
equation

    $$Ax = y \tag{1}$$

has a unique solution for every $y \in R_A$. If A is invertible, we can associate
the unique solution of (1) with each $y \in R_A$. This gives an operator, with
domain $R_A$, called the inverse of A and denoted by $A^{-1}$.


THEOREM 1. The inverse $A^{-1}$ of a linear operator A is itself linear.


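Proof. If

    $$Ax_1 = y_1, \qquad Ax_2 = y_2,$$

then

    $$A^{-1}y_1 = x_1, \qquad A^{-1}y_2 = x_2,$$

and hence

    $$\alpha_1 A^{-1}y_1 + \alpha_2 A^{-1}y_2 = \alpha_1 x_1 + \alpha_2 x_2. \tag{2}$$

On the other hand,

    $$A(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 y_1 + \alpha_2 y_2,$$

by the linearity of A, and hence

    $$A^{-1}(\alpha_1 y_1 + \alpha_2 y_2) = \alpha_1 x_1 + \alpha_2 x_2. \tag{3}$$

Comparing (2) and (3), we get

    $$A^{-1}(\alpha_1 y_1 + \alpha_2 y_2) = \alpha_1 A^{-1}y_1 + \alpha_2 A^{-1}y_2.$$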


Lemma. If M is an everywhere dense subset of a normed linear space E,
then every nonzero element $y \in E$ is the sum of a series of the form

    $$y = y_1 + y_2 + \cdots + y_k + \cdots,$$

where $y_k \in M$ and

    $$\|y_k\| \le \frac{3\|y\|}{2^k}.$$

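Proof. Since M is everywhere dense in E, given any nonzero $y \in E$, there is an
element $y_1 \in M$ such that

    $$\|y - y_1\| \le \frac{\|y\|}{2}.$$

By the same token, there are elements $y_2, y_3, \ldots, y_k, \ldots$ such that

    $$\begin{aligned}
    \|y - y_1 - y_2\| &\le \frac{\|y\|}{4}, \\
    \|y - y_1 - y_2 - y_3\| &\le \frac{\|y\|}{8}, \\
    &\;\;\vdots \\
    \|y - y_1 - \cdots - y_k\| &\le \frac{\|y\|}{2^k}, \\
    &\;\;\vdots
    \end{aligned}$$

Then

    $$\Bigl\|y - \sum_{k=1}^n y_k\Bigr\| \to 0$$

as $n \to \infty$, by the construction of the sequence $\{y_k\}$, i.e., the series

    $$\sum_{k=1}^\infty y_k$$

converges to y. Moreover

    $$\|y_1\| = \|y_1 - y + y\| \le \|y_1 - y\| + \|y\| \le \frac{\|y\|}{2} + \|y\| = \frac{3\|y\|}{2},$$

    $$\|y_2\| = \|y_2 + y_1 - y + y - y_1\| \le \|y - y_1 - y_2\| + \|y - y_1\| \le \frac{\|y\|}{4} + \frac{\|y\|}{2} = \frac{3\|y\|}{4},$$

and in general,

    $$\begin{aligned}
    \|y_k\| &= \|y_k + y_{k-1} + \cdots + y_1 - y + y - y_1 - \cdots - y_{k-1}\| \\
            &\le \|y - y_1 - \cdots - y_k\| + \|y - y_1 - \cdots - y_{k-1}\| \\
            &\le \frac{\|y\|}{2^k} + \frac{\|y\|}{2^{k-1}} = \frac{3\|y\|}{2^k}.
    \end{aligned}$$ ∎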
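
The greedy choice of the terms $y_k$ in this proof can be tried out in a concrete case. The following is only a sketch under assumptions that are not part of the text: E is taken to be the real line, M the set of dyadic rationals, and y = 1/3; the helper names `nearest_dyadic` and `dense_series` are hypothetical.

```python
from fractions import Fraction


def nearest_dyadic(t, tol):
    """Return a dyadic rational j / 2**m with |t - j / 2**m| <= tol (tol > 0)."""
    m = 0
    # Refine the grid until its step 2**-m is at most 2 * tol; rounding to the
    # nearest grid point then gives an error of at most half a step.
    while Fraction(1, 2**m) > 2 * tol:
        m += 1
    return Fraction(round(t * 2**m), 2**m)


def dense_series(y, terms):
    """Choose y_1, ..., y_terms in M with |y - (y_1 + ... + y_k)| <= |y| / 2**k."""
    ys, partial = [], Fraction(0)
    for k in range(1, terms + 1):
        y_k = nearest_dyadic(y - partial, abs(y) / 2**k)
        ys.append(y_k)
        partial += y_k
    return ys


y = Fraction(1, 3)          # an element of E = R that does not belong to M
ys = dense_series(y, 10)

partial = Fraction(0)
for k, y_k in enumerate(ys, start=1):
    partial += y_k
    # Remainder bound from the construction and term bound from the lemma.
    assert abs(y - partial) <= abs(y) / 2**k
    assert abs(y_k) <= 3 * abs(y) / 2**k
```

Exact rational arithmetic is used so that the two inequalities are checked without rounding error.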


THEOREM 2 (Banach). Let A be an invertible bounded linear operator
mapping a Banach space E onto another Banach space $E_1$. Then the
inverse operator $A^{-1}$ is itself bounded.


Proof. Let $M_k$ be the subset of $E_1$ consisting of all $y \in E_1$ such that

    $$\|A^{-1}y\| \le k\,\|y\|.$$

Every element in $E_1$ belongs to some $M_k$, i.e.,

    $$E_1 = \bigcup_{k=1}^\infty M_k.$$

By Baire's theorem (Theorem 3, p. 61), at least one of the sets $M_k$,
say $M_n$, is dense in some (open) sphere $S \subset E_1$. Choosing a point
$y_0 \in S \cap M_n$, we can find numbers $\alpha$ and $\beta$ ($\alpha < \beta$) such that S contains
the spherical layer

    $$P = \{z : \alpha < \|z - y_0\| < \beta,\ z \in E_1\}.$$

Shifting P so that its center coincides with the origin, we get another
spherical layer $P_0$. Some set $M_N$ is dense in $P_0$. In fact, if $z \in P \cap M_n$,
then $z - y_0 \in P_0$ and

    $$\begin{aligned}
    \|A^{-1}(z - y_0)\| &\le \|A^{-1}z\| + \|A^{-1}y_0\| \le n(\|z\| + \|y_0\|) \\
        &\le n(\|z - y_0\| + 2\|y_0\|) \\
        &= n\|z - y_0\|\Bigl(1 + \frac{2\|y_0\|}{\|z - y_0\|}\Bigr)
         \le n\|z - y_0\|\Bigl(1 + \frac{2\|y_0\|}{\alpha}\Bigr),
    \end{aligned} \tag{4}$$

where the quantity

    $$\gamma = n\Bigl(1 + \frac{2\|y_0\|}{\alpha}\Bigr)$$

is independent of z. Let

    $$N = 1 + [\gamma]$$

(recall footnote 4, p. 8). Then, by (4), $z - y_0 \in M_N$. Hence $M_N$ is
dense in $P_0$, since $M_n$ is dense in P.

Now, given any nonzero element $y \in E_1$, we can always find a number
$\lambda \ne 0$ such that $\alpha < \|\lambda y\| < \beta$, i.e., such that $\lambda y \in P_0$. Since $M_N$ is dense
in $P_0$, there is a sequence $\{\eta_k\}$, $\eta_k \in M_N$, converging to $\lambda y$. Then $\{\eta_k/\lambda\}$
converges to y. Clearly, if $\eta_k \in M_N$, then $\eta_k/\lambda \in M_N$ for any $\lambda \ne 0$.
Therefore $M_N$ is dense in $E_1 - \{0\}$ and hence in $E_1$ itself. It follows
from the lemma that y is the sum of a series of the form

    $$y = y_1 + y_2 + \cdots + y_k + \cdots,$$

where $y_k \in M_N$ and

    $$\|y_k\| \le \frac{3\|y\|}{2^k}.$$

Consider the series

    $$\sum_{k=1}^\infty x_k \tag{5}$$

with terms $x_k = A^{-1}y_k \in E$ equal to the preimages of the elements
$y_k \in E_1$. Since

    $$\|x_k\| = \|A^{-1}y_k\| \le N\|y_k\| \le N\,\frac{3\|y\|}{2^k},$$

the series (5) converges to an element $x \in E$, where

    $$\|x\| \le \sum_{k=1}^\infty \|x_k\| \le 3N\|y\| \sum_{k=1}^\infty \frac{1}{2^k} = 3N\|y\|.$$

Since (5) is convergent and the operator A is continuous on E (being
bounded), we can apply A term by term to (5), obtaining

    $$Ax = Ax_1 + Ax_2 + \cdots + Ax_k + \cdots = y_1 + y_2 + \cdots + y_k + \cdots = y,$$

which implies

    $$x = A^{-1}y.$$

Moreover,

    $$\|A^{-1}y\| = \|x\| \le 3N\|y\|$$

for all $y \ne 0$, and hence $A^{-1}$ is bounded. ∎


THEOREM 3. Let $A_0$ be an invertible bounded linear operator mapping
a Banach space E into another Banach space $E_1$, and let $\Delta A$ be a bounded
linear operator mapping E into $E_1$ such that

    $$\|\Delta A\| < \frac{1}{\|A_0^{-1}\|}. \tag{6}$$

Then the operator

    $$A = A_0 + \Delta A$$

maps E onto $E_1$ and has a bounded inverse.


Proof. Let y be a fixed element of $E_1$, and consider the mapping B of
the space E into itself defined by

    $$Bx = A_0^{-1}y - A_0^{-1}\Delta A\,x.$$

It follows from (6) that B is a contraction mapping. Hence, by Theorem
1, p. 66, B has a unique fixed point x such that

    $$x = Bx = A_0^{-1}y - A_0^{-1}\Delta A\,x. \tag{7}$$

But (7) implies

    $$Ax = A_0x + \Delta A\,x = y.$$

Clearly, if $Ax' = y$, then $x'$ is also a fixed point of B, and hence $x' = x$.
Therefore, given any $y \in E_1$, the equation $Ax = y$ has a unique solution
in E, i.e., the operator A is invertible with inverse $A^{-1}$. Moreover, $A^{-1}$
is bounded, by Theorem 2. ∎
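
The fixed-point argument in this proof is also a practical recipe for solving Ax = y. The sketch below runs the iteration x → Bx in an assumed finite-dimensional setting: E = E_1 is the plane with the max norm (so that the operator norm is the largest absolute row sum), and the matrices and the helper `matvec` are illustrative choices, not data from the text.

```python
def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


# Assumed illustrative data: A0 = 2I, so A0^{-1} = I/2 and ||A0^{-1}|| = 1/2.
A0_inv = [[0.5, 0.0], [0.0, 0.5]]
dA = [[0.5, -0.5], [0.25, 0.5]]      # max row sum 1.0 < 1 / ||A0^{-1}|| = 2
A = [[2.5, -0.5], [0.25, 2.5]]       # A = A0 + dA
y = [1.0, 3.0]

x = [0.0, 0.0]
for _ in range(60):
    # One step of x -> Bx = A0^{-1} y - A0^{-1} dA x.
    dAx = matvec(dA, x)
    x = matvec(A0_inv, [yi - di for yi, di in zip(y, dAx)])

# The fixed point of B should solve Ax = y.
residual = max(abs(a - b) for a, b in zip(matvec(A, x), y))
print(x, residual)
```

Here B contracts distances by the factor 1/2, so sixty steps are far more than enough to reach the fixed point to machine precision.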


THEOREM 4. Let E be a Banach space, and let I be the identity operator
on E. Suppose A is a bounded linear operator mapping E into itself, such
that

    $$\|A\| < 1. \tag{8}$$

Then the operator $(I - A)^{-1}$ exists, is bounded and can be represented in
the form

    $$(I - A)^{-1} = \sum_{k=0}^\infty A^k. \tag{9}$$



Proof. The existence and boundedness of $(I - A)^{-1}$ follows from
Theorem 3 (and will also emerge in the course of the proof). It follows
from (8) that

    $$\sum_{k=0}^\infty \|A^k\| \le \sum_{k=0}^\infty \|A\|^k < \infty.$$

But then, by the completeness of E, the sum of the series

    $$\sum_{k=0}^\infty A^k$$

is a bounded linear operator (see Problem 11c, p. 227). Given any n,
we have

    $$(I - A)\sum_{k=0}^n A^k = \sum_{k=0}^n A^k (I - A) = I - A^{n+1}.$$

Hence, taking the limit as $n \to \infty$ and bearing in mind that

    $$\|A^{n+1}\| \le \|A\|^{n+1} \to 0,$$

we get

    $$(I - A)\sum_{k=0}^\infty A^k = I,$$

which implies (9). ∎
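
The series representation of $(I - A)^{-1}$ can be checked numerically in a small case. This is a minimal sketch, assuming E is the plane with the max norm and A is an arbitrarily chosen 2 × 2 matrix of norm 1/2; the helper `matmul` is hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[0.2, 0.3], [0.1, 0.4]]          # assumed example; max row sum 0.5 < 1
I = [[1.0, 0.0], [0.0, 1.0]]

S = [[0.0, 0.0], [0.0, 0.0]]          # partial sum of the series
P = I                                 # current power A^k, starting with A^0
for _ in range(80):
    S = [[S[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    P = matmul(P, A)

I_minus_A = [[I[i][j] - A[i][j] for j in range(2)] for i in range(2)]
product = matmul(I_minus_A, S)        # should be close to I
error = max(abs(product[i][j] - I[i][j]) for i in range(2) for j in range(2))
print(error)
```

By the identity used in the proof, the product differs from I by the power $A^{80}$, whose norm is at most $2^{-80}$, so the printed error reflects rounding only.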


23.2. The adjoint operator. Given two topological linear spaces E and
$E_1$, let A be a continuous linear operator mapping E into $E_1$, and let g be a
continuous linear functional on $E_1$, i.e., an element of the conjugate space
$E_1^*$. Suppose we apply g to the element y = Ax, thereby obtaining a new
functional

    $$f(x) = g(Ax) \qquad (x \in E). \tag{10}$$

Clearly, f is continuous and linear (why?), and hence an element of the
conjugate space $E^*$. Thus (10) associates a functional $f \in E^*$ with each
functional $g \in E_1^*$, i.e., (10) defines an operator mapping $E_1^*$ into $E^*$. This
operator is called the adjoint of A, and is denoted by $A^*$. Using the symmetric
notation (f, x) for the functional f(x), we can write (10) in the form


    $$(g, Ax) = (f, x),$$

or

    $$(g, Ax) = (A^*g, x). \tag{11}$$

Equation (11) can be regarded as a concise definition of the adjoint of A.


Example. As in Example 3, p. 222, suppose A is a linear operator with
matrix $\|a_{ij}\|$ mapping m-space $R^m$ into n-space $R^n$. Then the mapping y = Ax
can be written as a system of equations

    $$y_i = \sum_{j=1}^m a_{ij} x_j \qquad (i = 1, \ldots, n), \tag{12}$$

while the functional f(x) can be written in the form

    $$f(x) = \sum_{j=1}^m f_j x_j,$$

where $f_j = f(e_j)$ in terms of a basis $e_1, \ldots, e_m$ in $R^m$. Since

    $$f(x) = g(Ax) = \sum_{i=1}^n g_i y_i = \sum_{i=1}^n \sum_{j=1}^m g_i a_{ij} x_j = \sum_{j=1}^m x_j \sum_{i=1}^n g_i a_{ij},$$

we find that

    $$f_j = \sum_{i=1}^n a_{ij} g_i,$$

or

    $$f_i = \sum_{j=1}^n a_{ji} g_j \tag{13}$$

after interchanging the roles of the indices i and j. But $f = A^*g$, and hence
comparing (12) and (13), we see that the matrix of the operator $A^*$ is $\|a_{ji}\|$,
i.e., the transpose of the matrix of A.
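
The relation between the adjoint and the transposed matrix is easy to verify by direct computation. The sketch below uses assumed sample data (a 2 × 3 integer matrix, a vector x and the coefficients g of a functional) and checks that the two sides of the defining identity for the adjoint agree.

```python
A = [[1, 2, 3],
     [4, 5, 6]]            # n = 2 rows, m = 3 columns: maps R^3 into R^2
x = [1, -1, 2]             # element of R^3
g = [3, -2]                # coefficients of a functional on R^2

n, m = len(A), len(A[0])

# y = Ax, as in (12).
y = [sum(A[i][j] * x[j] for j in range(m)) for i in range(n)]
# f = A*g, computed with the transposed matrix, as in (13).
f = [sum(A[j][i] * g[j] for j in range(n)) for i in range(m)]

lhs = sum(g[i] * y[i] for i in range(n))   # (g, Ax)
rhs = sum(f[i] * x[i] for i in range(m))   # (A*g, x)
print(lhs, rhs)   # -7 -7
```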


It follows at once from the definition of the adjoint of an operator that

1) $A^*$ is linear;

2) $(A + B)^* = A^* + B^*$;

3) $(\alpha A)^* = \alpha A^*$ for arbitrary complex $\alpha$.


A somewhat less obvious property of the adjoint operator is given by 


THEOREM 5. Let A be a bounded linear operator mapping a Banach
space E into another Banach space $E_1$, and let $A^*$ be the adjoint of A.
Then $A^*$ is bounded and

    $$\|A^*\| = \|A\|. \tag{14}$$


Proof. By the properties of the norm of an operator, we have

    $$|(A^*g, x)| = |(g, Ax)| \le \|g\|\,\|A\|\,\|x\|,$$

which implies

    $$\|A^*g\| \le \|A\|\,\|g\|,$$

and hence

    $$\|A^*\| \le \|A\|. \tag{15}$$

Suppose $x \in E$, $Ax \ne 0$, and let

    $$y_0 = \frac{Ax}{\|Ax\|} \in E_1,$$

so that, in particular, $\|y_0\| = 1$. Let g be the functional such that

    $$g(\lambda y_0) = \lambda$$

on the set $L \subset E_1$ of all elements of the form $\lambda y_0$. Then clearly $(g, y_0) = 1$,
$\|g\|_{\text{on } L} = 1$. Using the Hahn-Banach theorem, we can extend g to a
functional on the whole space $E_1$ such that $\|g\| = 1$ and

    $$(g, y_0) = 1, \quad \text{i.e.,} \quad (g, Ax) = \|Ax\|.$$

Therefore

    $$\|Ax\| = (g, Ax) = |(A^*g, x)| \le \|A^*g\|\,\|x\| \le \|A^*\|\,\|g\|\,\|x\| = \|A^*\|\,\|x\|,$$

which implies

    $$\|A\| \le \|A^*\|. \tag{16}$$

Comparing (15) and (16), we get (14). ∎


23.3. The adjoint operator in Hilbert space. Self-adjoint operators. Next
we consider the case where A is a bounded linear operator mapping a (real
or complex) Hilbert space H into itself. According to the corollary to
Theorem 2, p. 188, the mapping $\tau$ assigning the linear functional

    $$(\tau y)(x) = (x, y)$$

to every $y \in H$ establishes an isomorphism between H and the conjugate
space $H^*$.² Let $A^*$ be the adjoint of the operator A. Then clearly the
mapping $\tilde{A}^* = \tau^{-1}A^*\tau$ is a bounded linear operator mapping H into itself,
such that

    $$(Ax, y) = (x, \tilde{A}^*y) \tag{17}$$

for all $x, y \in H$. Moreover $\|\tilde{A}^*\| = \|A\|$, since $\|A^*\| = \|A\|$ and the mappings
$\tau$ and $\tau^{-1}$ are isometric.

We now establish the following convention: If H is a Hilbert space, then
by the adjoint of an operator A mapping H into H, we mean the operator
$\tilde{A}^*$ defined by (17). Note that $\tilde{A}^*$, like A, maps H into H. To keep the
notation simple, we will henceforth drop the tilde, writing $A^*$ instead of
$\tilde{A}^*$. Replacing $\tilde{A}^*$ by $A^*$ in (17), we get

    $$(Ax, y) = (x, A^*y) \tag{17'}$$

for all $x, y \in H$.





2 Or a “conjugate-linear isomorphism” in the case where H is complex (see Problem 6,
p. 194).

Remark. It should be emphasized that this definition of $A^*$ differs from
the definition of the adjoint of an operator A mapping an arbitrary Banach
space E into itself, in which case $A^*$ is defined on the conjugate space $E^*$
rather than on the space E itself. The context will always make it clear
whether $A^*$ is the operator defined by (11) or the operator defined by (17').


Let A be a bounded linear operator mapping a Hilbert space H into itself. 
Then it makes sense to ask whether or not A = A*, since A and A* are 
defined on the same space. This leads to the following 


DEFINITION. A bounded linear operator A mapping a Hilbert space H
into itself is said to be self-adjoint if $A = A^*$, i.e., if

    $$(Ax, y) = (x, Ay)$$

for all $x, y \in H$.


Remark. Everything said above continues to hold if we replace H by the
real n-space $R^n$ or complex n-space $C^n$.


23.4. The spectrum of an operator. The resolvent. In the theory of linear
operators and their applications, a central role is played by the notion of
the “spectrum” of an operator.³ Let A be a linear operator mapping a
topological linear space E into itself. Then a number $\lambda$ is called an eigenvalue
of A if the equation

    $$Ax = \lambda x$$

has at least one nonzero solution, and every such solution x is called an
eigenvector of A (corresponding to the eigenvalue $\lambda$). Suppose E is
finite-dimensional. Then the set of all eigenvalues of A is called the spectrum of
A, and all other values of $\lambda$ are said to be regular (points). In other words,
$\lambda$ is regular if and only if the operator $A - \lambda I$ is invertible. The operator
$(A - \lambda I)^{-1}$ is then automatically bounded, like every operator on a
finite-dimensional space (cf. Problem 1, p. 226). Thus there are just two possibilities
in the finite-dimensional case:

1) The equation $Ax = \lambda x$ has a nonzero solution, i.e., $\lambda$ is an eigenvalue
of A, so that the operator $(A - \lambda I)^{-1}$ fails to exist;
2) The operator $(A - \lambda I)^{-1}$ exists and is bounded, i.e., $\lambda$ is a regular
point.

However, in the case where E is infinite-dimensional, there is a third
possibility:

3) The operator $(A - \lambda I)^{-1}$ exists (i.e., the equation $Ax = \lambda x$ has no
nonzero solutions), but is not bounded.


3 In talking about the spectrum of an operator, it will always be tacitly assumed that
the operator is defined on a complex space.

To describe this more general situation, we introduce some new terminology
and make an important modification in the definition of the spectrum.
Given an operator A mapping a (complex) topological linear space E into
itself, the operator

    $$R_\lambda = (A - \lambda I)^{-1} \tag{18}$$

is called the resolvent of A. The values of $\lambda$ for which $R_\lambda$ is defined on all
of E and continuous are said to be regular (points) of A, and the set of all other
values of $\lambda$ is called the spectrum of A. The eigenvalues of A still belong to
the spectrum, since if $(A - \lambda I)x = 0$ for some $x \ne 0$, then (18) fails to exist.
The set of all these eigenvalues is now called the point spectrum, and the rest
of the spectrum is called the continuous spectrum. In other words, the
continuous spectrum consists of all $\lambda$ for which (18) exists but fails to be
continuous. Thus there are now exactly three possibilities for any given value
of $\lambda$:

1) $\lambda$ is a regular point;
2) $\lambda$ is an eigenvalue;
3) $\lambda$ is a point of the continuous spectrum.


The possibility of an operator having a continuous spectrum is a characteristic
feature of the theory of operators in infinite-dimensional spaces,
distinguishing it from the finite-dimensional case.


THEOREM 6. Let A be a linear operator mapping a Banach space E
into itself. Then the set $\Lambda$ of all regular points of A is open (equivalently,
the complement of $\Lambda$ is closed).


Proof. If $\lambda$ is regular, the operator $(A - \lambda I)^{-1}$ exists and is bounded.
Hence, for sufficiently small $\delta$, the operator $(A - (\lambda + \delta)I)^{-1}$ also exists
and is bounded, by Theorem 3. In other words, the point $\lambda + \delta$ is regular
for sufficiently small $\delta$. ∎


THEOREM 7. If A is a bounded linear operator mapping a Banach space
E into itself and if $|\lambda| > \|A\|$, then $\lambda$ is a regular point. In other words,
the spectrum of A is contained in the disk of radius $\|A\|$ with center at the
origin.


Proof. Obviously

    $$A - \lambda I = -\lambda\Bigl(I - \frac{A}{\lambda}\Bigr)$$

and

    $$R_\lambda = (A - \lambda I)^{-1} = -\frac{1}{\lambda}\Bigl(I - \frac{A}{\lambda}\Bigr)^{-1}.$$

If $\|A\| < |\lambda|$, then $\|A/\lambda\| < 1$, and hence $R_\lambda$ exists and is bounded, by
Theorem 4. ∎
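
In the finite-dimensional case the spectrum is just the set of eigenvalues, so the bound of Theorem 7 can be checked directly. The sketch below assumes a sample 2 × 2 matrix acting on the complex plane $C^2$ with the max norm, for which the operator norm is the largest absolute row sum.

```python
import cmath

A = [[1.0, 2.0], [3.0, -1.0]]      # assumed example matrix

# Operator norm induced by the max norm on C^2: the largest absolute row sum.
norm_A = max(sum(abs(a) for a in row) for row in A)

# Eigenvalues of a 2 x 2 matrix from its characteristic polynomial
# t^2 - (trace) t + (determinant) = 0.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = cmath.sqrt(trace * trace - 4 * det)
eigenvalues = [(trace + root) / 2, (trace - root) / 2]

print(norm_A, [abs(lam) for lam in eigenvalues])
```

For this matrix the eigenvalues are the two square roots of 7, both of absolute value less than the norm 4.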
Example 1. In the space $C = C_{[0,1]}$, consider the operator A defined by

    $$Ax(t) = \mu(t)x(t),$$

where $\mu(t)$ is a fixed function continuous on [0, 1]. Then

    $$(A - \lambda I)x(t) = (\mu(t) - \lambda)x(t),$$

and

    $$(A - \lambda I)^{-1}x(t) = \frac{1}{\mu(t) - \lambda}\,x(t).$$

Hence the spectrum of A consists of all $\lambda$ such that $\mu(t) - \lambda$ vanishes for
some t in the interval [0, 1], i.e., the spectrum is the range of the function
$\mu(t)$.


Example 2. Suppose $\mu(t) = t$ in the preceding example. Then the spectrum
is just the interval [0, 1]. On the other hand, there are obviously no
eigenvalues. Thus the operator A defined by

    $$Ax(t) = tx(t)$$

is an example of an operator with a purely continuous spectrum.


Finally, for self-adjoint operators in a Hilbert space, we have the following 
analogue of a well-known result for finite-dimensional Euclidean spaces 
(proved in exactly the same way): 


THEOREM 8. Let A be a self-adjoint operator mapping a (complex) 
Hilbert space H into itself. Then all the eigenvalues of A are real, and two 
eigenvectors of A corresponding to distinct eigenvalues are orthogonal. 


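Proof. If

    $$Ax = \lambda x \qquad (x \ne 0),$$

then

    $$\lambda(x, x) = (Ax, x) = (x, Ax) = (x, \lambda x) = \bar{\lambda}(x, x),$$

and hence $\lambda = \bar{\lambda}$. Moreover, if

    $$Ax = \lambda x, \qquad Ay = \mu y \qquad (\lambda \ne \mu),$$

then

    $$\lambda(x, y) = (Ax, y) = (x, Ay) = (x, \mu y) = \bar{\mu}(x, y) = \mu(x, y),$$

and hence

    $$(x, y) = 0,$$

i.e., the vectors x and y are orthogonal. ∎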
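
Both assertions of Theorem 8 can be observed on a small Hermitian matrix. The following sketch assumes H is the complex plane $C^2$ with the usual scalar product; the matrix and the helper `inner` are illustrative choices.

```python
import cmath

# Assumed example: a Hermitian matrix, i.e. a self-adjoint operator on C^2.
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]


def inner(x, y):
    """Scalar product (x, y) on C^2, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
root = cmath.sqrt(trace * trace - 4 * det)
lam, mu = (trace + root) / 2, (trace - root) / 2

# For a 2 x 2 matrix with A[0][1] != 0, the vector (A[0][1], t - A[0][0]) is an
# eigenvector belonging to the eigenvalue t.
x = [A[0][1], lam - A[0][0]]
y = [A[0][1], mu - A[0][0]]

print(lam, mu, inner(x, y))
```

The eigenvalues come out as 4 and 1 with zero imaginary part, and the scalar product of the two eigenvectors vanishes.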


Problem 1. Given two normed linear spaces E and $E_1$, a linear operator
A from E to $E_1$, with domain $D_A$, is said to be closed if $x_n \in D_A$, $x_n \to x$,
$Ax_n \to y$ implies $x \in D_A$, $Ax = y$. Prove that every bounded operator is
closed.


Problem 2. Let E and $E_1$ be normed linear spaces, with norms $\|\cdot\|$ and
$\|\cdot\|_1$, respectively. By the direct (or Cartesian) product of E and $E_1$, denoted
by $E \times E_1$, we mean the set of all ordered pairs (x, y), $x \in E$, $y \in E_1$. Prove
that $E \times E_1$ is a normed linear space when equipped with the norm

    $$\|(x, y)\| = \|x\| + \|y\|_1$$

(addition of elements and multiplication of elements by numbers being defined
in the obvious way). By the graph of a linear operator A from E to $E_1$ we
mean the subset of $E \times E_1$ equal to

    $$G_A = \{(x, y) : x \in D_A,\ y = Ax\}.$$

Prove that

a) $G_A$ is a linear subspace of $E \times E_1$;

b) $G_A$ is closed if and only if the operator A is closed;

c) If E and $E_1$ are Banach spaces and if A is closed and defined for all
$x \in E$, so that $D_A = E$, then A is bounded (this is Banach's closed
graph theorem).


Hint. In c) apply Theorem 2 to the projection operator P carrying each
ordered pair $(x, Ax) \in G_A$ into the element $x \in E$.


Problem 3. Prove that if A is an invertible continuous linear operator
mapping a complete countably normed space E into another complete
countably normed space $E_1$, then the inverse operator $A^{-1}$ is itself continuous.
State and prove the closed graph theorem for countably normed spaces.


Problem 4. Let A be a continuous linear operator mapping a Banach
space E onto another Banach space $E_1$. Prove that there is a constant $\alpha > 0$
such that if $B \in \mathcal{L}(E, E_1)$ and $\|A - B\| < \alpha$, then B also maps E onto (all of) $E_1$.


Problem 5. Let A be an operator mapping a Hilbert space H into itself.
Then a subspace $M \subset H$ is said to be invariant under A if $x \in M$ implies
$Ax \in M$. Prove that if M is invariant under A, then its orthogonal
complement $M' = H \ominus M$ is invariant under the adjoint operator $A^*$ (in
particular, under A itself if A is self-adjoint).


Problem 6. Let A and B be bounded linear operators mapping a complex
Hilbert space H into itself. Prove that

a) $(\alpha A + \beta B)^* = \bar{\alpha}A^* + \bar{\beta}B^*$;

b) $(AB)^* = B^*A^*$;

c) $(A^*)^* = A$;

d) $I^* = I$, where I is the identity operator.


Problem 7. Give an example of an operator whose spectrum consists of
a single point.

Problem 8. Given a bounded linear operator A mapping a Banach space
E into itself, prove that the limit

    $$r = \lim_{n \to \infty} \sqrt[n]{\|A^n\|}$$

exists. Show that the spectrum of A is contained in the disk of radius r
with center at the origin.


Comment. The quantity r is called the spectral radius of the operator A.
This result contains Theorem 7 as a special case, since $\|A^n\| \le \|A\|^n$.
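
The spectral radius can be strictly smaller than the norm, which is easy to see numerically. The sketch below is an illustration under assumptions: a sample 2 × 2 matrix, the max norm on the plane (so that the operator norm is the largest absolute row sum), and hypothetical helpers `matmul` and `norm`.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def norm(X):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in X)


A = [[0.5, 1.0], [0.0, 0.5]]    # assumed example; its only eigenvalue is 0.5

P = [[1.0, 0.0], [0.0, 1.0]]
estimates = {}
for n in range(1, 201):
    P = matmul(P, A)                        # P = A^n
    estimates[n] = norm(P) ** (1.0 / n)     # n-th root of ||A^n||

print(norm(A), estimates[1], estimates[10], estimates[200])
```

The norm of this matrix is 1.5, while the computed roots decrease toward 0.5, the absolute value of its only eigenvalue.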


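Problem 9. Let $R_\lambda = (A - \lambda I)^{-1}$ and $R_\mu = (A - \mu I)^{-1}$ be the resolvents
corresponding to the points $\lambda$ and $\mu$. Prove that $R_\mu R_\lambda = R_\lambda R_\mu$ and

    $$R_\mu - R_\lambda = (\mu - \lambda)R_\mu R_\lambda. \tag{19}$$

Hint. Multiply both sides of (19) by $(A - \lambda I)(A - \mu I)$.


Comment. It follows from (19) that if $\lambda_0$ is a regular point of A, then
the derivative of $R_\lambda$ with respect to $\lambda$ at the point $\lambda_0$, i.e., the limit

    $$\lim_{\Delta\lambda \to 0} \frac{R_{\lambda_0 + \Delta\lambda} - R_{\lambda_0}}{\Delta\lambda}$$

(in the sense of convergence with respect to the operator norm) exists and
equals $R_{\lambda_0}^2$.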


Problem 10. Let A be a bounded self-adjoint operator mapping a complex 
Hilbert space H into itself. Prove that the spectrum of A is a closed bounded 
subset of the real line. 


Problem 11. Prove that every bounded linear operator defined on a com- 
plex Banach space with at least one nonzero element has a nonempty 
spectrum. 


24. Completely Continuous Operators 


24.1. Definitions and examples. We now discuss a class of operators which
closely resemble operators acting in a finite-dimensional space and at the 
same time are very important from the standpoint of applications: 


DEFINITION. A linear operator A mapping a Banach space E into 
itself is said to be completely continuous if it maps every bounded set into 
a relatively compact set.

Remark 1. If E is finite-dimensional, then every linear operator A
mapping E into E is completely continuous. In fact, A maps bounded sets
into bounded sets (recall Problem 1, p. 226) and hence maps bounded sets
into relatively compact sets (why?).


Remark 2. In an infinite-dimensional space, complete continuity of an
operator is a stronger requirement than merely being continuous (i.e.,
bounded). For example, the identity operator in an infinite-dimensional
space is continuous but not completely continuous (see Example 1 below).


LEMMA. Let $x_1, x_2, \ldots$ be linearly independent vectors in a normed
linear space E, and let $E_n$ be the subspace generated by the vectors
$x_1, \ldots, x_n$. Then there are vectors $y_1, y_2, \ldots$ such that $y_n \in E_n$, $\|y_n\| = 1$
and⁴

    $$\rho(E_{n-1}, y_n) = \inf_{x \in E_{n-1}} \|x - y_n\| > \tfrac{1}{2}.$$

Proof. Since the vectors $x_1, x_2, \ldots$ are linearly independent, we have
$x_n \notin E_{n-1}$ and hence

    $$\rho(E_{n-1}, x_n) = \alpha > 0$$

(recall Problem 5a, p. 141). Let $x^*$ be a vector in $E_{n-1}$ such that

    $$\|x_n - x^*\| < 2\alpha.$$

Then

    $$\rho(E_{n-1}, x_n - x^*) = \alpha,$$

and the vectors

    $$y_1 = \frac{x_1}{\|x_1\|}, \qquad y_n = \frac{x_n - x^*}{\|x_n - x^*\|} \qquad (n = 2, 3, \ldots)$$

satisfy all the conditions of the lemma. ∎


Example 1. The identity operator I in an infinite-dimensional Banach
space E is not completely continuous. In fact, we need only show that the
closed unit sphere S in E (which is obviously carried into itself by I) is not
compact. This follows at once from the lemma, since S contains a sequence
of vectors $y_1, y_2, \ldots$ such that

    $$\rho(y_{n-1}, y_n) > \tfrac{1}{2},$$

and such a sequence clearly cannot contain a convergent subsequence.
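
A concrete instance of such a sequence is given by the coordinate unit vectors of the space $l_2$. The sketch below assumes this particular space (the text treats an arbitrary infinite-dimensional Banach space) and stores only the first n coordinates, which is enough to compute the mutual distances of the first n unit vectors.

```python
import math

n = 6   # truncation level (an assumption: only finitely many coordinates are stored)

# e_1, ..., e_n: the first n coordinate unit vectors of l_2, truncated to R^n.
basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def dist(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


distances = [dist(basis[p], basis[q]) for p in range(n) for q in range(p)]
print(min(distances), max(distances))
```

Every pair of distinct unit vectors is at distance $\sqrt{2}$, whatever the value of n, so no subsequence of them can be a Cauchy sequence.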


Example 2. Let A be a continuous linear operator on an infinite-dimensional
Banach space E, where A is “degenerate” in the sense that it maps
E into a finite-dimensional subspace of E. Then A is completely continuous,
since it maps every bounded subset $M \subset E$ into a bounded subset of a
finite-dimensional space, and hence into a relatively compact set.


4 The quantity $\rho(E_{n-1}, y_n)$ is, of course, just the distance between the set $E_{n-1}$ and the
point $y_n$ (cf. Problem 9, p. 54).


Turning to the space $C_{[a,b]}$ of functions continuous on the interval [a, b],
we now establish conditions under which the “integral operator” A defined
by

    $$\psi(x) = (A\varphi)(x) = \int_a^b K(x, y)\varphi(y)\,dy \tag{1}$$

is completely continuous.

THEOREM 1. Suppose the kernel K(x, y) is such that

1) K(x, y) is bounded on the square $a \le x \le b$, $a \le y \le b$;
2) The discontinuities (if any) of K(x, y) all lie on a finite number of
curves

    $$y = f_k(x) \qquad (k = 1, \ldots, n),$$

where the functions $f_k$ are continuous.

Then (1) is a completely continuous operator mapping $C_{[a,b]}$ into $C_{[a,b]}$.
Proof. First we note that the conditions 1) and 2) guarantee the
existence of the integral (1) for every $x \in [a, b]$, so that $\psi(x)$ is defined
on [a, b]. Let R be the square $a \le x \le b$, $a \le y \le b$, and let

    $$M = \sup_{(x, y) \in R} |K(x, y)|. \tag{2}$$

Moreover, let G be the set of all points $(x, y) \in R$ such that

    $$|y - f_k(x)| < \frac{\varepsilon}{12Mn}$$

for at least one integer $k = 1, \ldots, n$, and let $F = R - G$. Since F is
compact (why?) and K(x, y) is continuous on F, given any $\varepsilon > 0$, there is
a $\delta > 0$ such that

    $$|K(x', y) - K(x'', y)| < \frac{\varepsilon}{3(b - a)} \tag{3}$$

for any two points $(x', y), (x'', y) \in F$ satisfying the condition

    $$|x' - x''| < \delta \tag{4}$$

(recall Theorem 1, p. 109).
Now suppose (4) holds. Then

    $$|\psi(x') - \psi(x'')| \le \int_a^b |K(x', y) - K(x'', y)|\,|\varphi(y)|\,dy. \tag{5}$$

To estimate the integral on the right, we divide the interval $a \le y \le b$
into the set

    $$P = \bigcup_{k=1}^n \Bigl\{y : |y - f_k(x')| < \frac{\varepsilon}{12Mn}\Bigr\} \cup \bigcup_{k=1}^n \Bigl\{y : |y - f_k(x'')| < \frac{\varepsilon}{12Mn}\Bigr\}$$

and the complementary set $Q = [a, b] - P$. Using (2) and noting that
P is a union of intervals of total length no greater than $\varepsilon/3M$, we have

    $$\int_P |K(x', y) - K(x'', y)|\,|\varphi(y)|\,dy \le \frac{2\varepsilon}{3}\,\|\varphi\|, \tag{6}$$

where, as usual,

    $$\|\varphi\| = \sup_{a \le y \le b} |\varphi(y)|.$$

On the other hand, it follows from (3) and (4) that

    $$\int_Q |K(x', y) - K(x'', y)|\,|\varphi(y)|\,dy < \frac{\varepsilon}{3}\,\|\varphi\|. \tag{7}$$

Comparing (5)-(7), we find that (4) implies

    $$|\psi(x') - \psi(x'')| < \varepsilon\,\|\varphi\|. \tag{8}$$

In particular, $\psi$ is continuous on [a, b], so that the operator A defined by
(1) actually maps the space $C_{[a,b]}$ into itself. Moreover, it follows from
(8) and from the estimate

    $$\|\psi\| = \sup_{a \le x \le b} |\psi(x)| \le \sup_{a \le x \le b} \int_a^b |K(x, y)|\,|\varphi(y)|\,dy \le M(b - a)\,\|\varphi\|$$

that A carries any (uniformly) bounded set of functions $\Phi \subset C_{[a,b]}$ into
a (uniformly) bounded equicontinuous set $\Psi \subset C_{[a,b]}$ (recall Definitions
3 and 4, p. 102). But then $\Psi$ is relatively compact, by Arzelà's theorem
(Theorem 4, p. 102), and hence A is completely continuous. ∎


Remark 1. The requirement that the discontinuities of the kernel K(x, y)
lie on a finite number of curves, each intersecting the lines x = const in a
single point, is essential. For example, let K(x, y) be the function

    $$K(x, y) = \begin{cases} 1 & \text{if } x < \tfrac{1}{2}, \\ 0 & \text{if } x \ge \tfrac{1}{2}, \end{cases}$$

defined on the square $0 \le x \le 1$, $0 \le y \le 1$. Then K(x, y) is discontinuous
at every point of the line segment $x = \tfrac{1}{2}$, $0 \le y \le 1$, and the operator (1)
with this kernel maps the function $\varphi(y) \equiv 1$ into a discontinuous function.
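
The jump in the image function can be seen by evaluating the integral on either side of the line x = 1/2. The sketch below approximates the integral by a midpoint rule; the step count and the helper names `K`, `apply_A` and `one` are assumptions made for illustration, and the value assigned to the kernel on the line x = 1/2 itself plays no role here.

```python
def K(x, y):
    """Kernel of Remark 1: 1 to the left of the line x = 1/2, 0 elsewhere."""
    return 1.0 if x < 0.5 else 0.0


def apply_A(phi, x, steps=1000):
    """Midpoint-rule approximation of the integral of K(x, y) * phi(y), 0 <= y <= 1."""
    h = 1.0 / steps
    return sum(K(x, (j + 0.5) * h) * phi((j + 0.5) * h) for j in range(steps)) * h


def one(y):
    """The function identically equal to 1."""
    return 1.0


left, right = apply_A(one, 0.25), apply_A(one, 0.75)
print(left, right)
```

The image of the constant function takes the value 1 to the left of x = 1/2 and the value 0 to the right, so it cannot be continuous.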


Remark 2. If K(x, y) = 0 for y > x, then (1) takes the form

    $$\psi(x) = (A\varphi)(x) = \int_a^x K(x, y)\varphi(y)\,dy.$$

Suppose K(x, y) is continuous for $y \le x$. Then it follows from Theorem 1
that the operator A, called a Volterra operator, is completely continuous.


24.2, Basic properties of completely continuous operators. We begin with 


THEOREM 2. Given a sequence $\{A_n\}$ of completely continuous operators
mapping a Banach space E into itself, suppose $\{A_n\}$ converges in norm to an
operator A, i.e., suppose $\|A - A_n\| \to 0$ as $n \to \infty$. Then A is itself
completely continuous.


Proof. To prove that A is completely continuous, we need only show
that the sequence $\{Ax_n\}$ contains a convergent subsequence whenever
the sequence $\{x_n\}$ of elements $x_n \in E$ is bounded, i.e., such that

    $$\|x_n\| \le M \tag{9}$$

for some M > 0 and all n = 1, 2, ... (why is A linear?). Since $A_1$ is
completely continuous, the sequence $\{A_1x_n\}$ contains a convergent
subsequence. In other words, there is a subsequence $\{x_n^{(1)}\}$ of the sequence
$\{x_n\}$ such that $\{A_1x_n^{(1)}\}$ converges. Similarly, since $A_2$ is completely
continuous, the sequence $\{A_2x_n^{(1)}\}$ in turn contains a convergent subsequence.
Thus there is a subsequence $\{x_n^{(2)}\}$ of the sequence $\{x_n^{(1)}\}$ such that $\{A_2x_n^{(2)}\}$
converges. Then obviously $\{A_1x_n^{(2)}\}$ also converges. Continuing this
argument, we find a subsequence $\{x_n^{(3)}\}$ of the sequence $\{x_n^{(2)}\}$ such that
$\{A_1x_n^{(3)}\}$, $\{A_2x_n^{(3)}\}$, $\{A_3x_n^{(3)}\}$ all converge, and so on. Consider the
“diagonal sequence”

    $$x_1^{(1)}, x_2^{(2)}, \ldots, x_n^{(n)}, \ldots$$

Then clearly each of the operators $A_1, A_2, \ldots, A_n, \ldots$ maps this
sequence into a convergent sequence.

We now show that the sequence $\{Ax_n^{(n)}\}$ also converges, thereby
completing the proof. Since the space E is complete, it is enough to show
that $\{Ax_n^{(n)}\}$ is a Cauchy sequence. Clearly

    $$\|Ax_n^{(n)} - Ax_{n'}^{(n')}\| \le \|Ax_n^{(n)} - A_kx_n^{(n)}\| + \|A_kx_n^{(n)} - A_kx_{n'}^{(n')}\| + \|A_kx_{n'}^{(n')} - Ax_{n'}^{(n')}\|. \tag{10}$$

Given any $\varepsilon > 0$, first choose k such that

    $$\|A - A_k\| < \frac{\varepsilon}{3M}. \tag{11}$$

Next, using the fact that $\{A_kx_n^{(n)}\}$ converges and hence is a Cauchy
sequence, choose N such that

    $$\|A_kx_n^{(n)} - A_kx_{n'}^{(n')}\| < \frac{\varepsilon}{3} \tag{12}$$

for all n, n' > N. Then it follows from (9)-(12) that

    $$\|Ax_n^{(n)} - Ax_{n'}^{(n')}\| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$

for all sufficiently large n and n', i.e., $\{Ax_n^{(n)}\}$ is a Cauchy sequence. ∎


Not only is the set of completely continuous operators closed (algebra- 
ically) under operator multiplication, but we have the following much stronger 
result: 


THEOREM 3. Let A be a completely continuous operator and B a 
bounded operator mapping a Banach space E into itself. Then the operators 
AB and BA are completely continuous. 


Proof. If the set $M \subset E$ is bounded, then $BM = \{y : y = Bx,\ x \in M\}$
is also bounded. Therefore ABM is relatively compact, and hence AB
is completely continuous. Moreover, if M is bounded, then AM is
relatively compact, and hence BAM is also relatively compact by the
continuity of B, i.e., BA is completely continuous. ∎


COROLLARY. A completely continuous operator A mapping a Banach
space E into itself cannot have a bounded inverse if E is infinite-dimensional.


Proof. If $A^{-1}$ were bounded, then, by Theorem 3, the identity
operator $I = A^{-1}A$ would be completely continuous. But this is
impossible, by Example 1, p. 240. ∎


THEOREM 4. Let A be a completely continuous operator mapping a 
Banach space E into itself. Then the adjoint operator A* is also completely 
continuous. 


Proof. We must show that $A^*$ carries every bounded subset of the
conjugate space $E^*$ into a relatively compact set. Since every bounded
subset of a normed linear space is contained in some closed sphere, it
is enough to show that $A^*$ maps every closed sphere into a relatively
compact set. In fact, by the linearity of $A^*$, we need only show that the
image $A^*S^*$ of the closed unit sphere $S^* \subset E^*$ is relatively compact.

Now suppose we regard the elements of $E^*$ as functionals not on the
whole space E but only on the compactum [AS] equal to the closure of
the image of the closed unit sphere under the operator A. Then the set $\Phi$
of functionals on [AS] corresponding to those in $S^*$ is uniformly bounded
and equicontinuous, since $\|\varphi\| \le 1$ implies

    $$\sup_{x \in [AS]} |\varphi(x)| = \sup_{x \in AS} |\varphi(x)| \le \|\varphi\| \sup_{x \in S} \|Ax\| \le \|A\|$$

and

    $$|\varphi(x') - \varphi(x'')| \le \|\varphi\|\,\|x' - x''\| \le \|x' - x''\|.$$

Hence, by Arzelà's theorem (Theorem 4, p. 102), $\Phi$ is relatively compact
in the space $C_{[AS]}$ of all continuous linear functionals on [AS]. But the
set $\Phi$, with the metric induced by the usual metric of $C_{[AS]}$, is isometric
to the set $A^*S^*$, with the metric induced by the norm of the space $E^*$.
In fact, if $g_1, g_2 \in S^*$, then

    $$\begin{aligned}
    \|A^*g_1 - A^*g_2\| &= \sup_{x \in S} |(A^*g_1 - A^*g_2, x)| = \sup_{x \in S} |(g_1 - g_2, Ax)| \\
        &= \sup_{z \in AS} |(g_1 - g_2, z)| = \sup_{z \in [AS]} |(g_1 - g_2, z)| = \rho(g_1, g_2).
    \end{aligned}$$

Being relatively compact, the set $\Phi$ is totally bounded, by Theorem 3,
p. 101. Therefore the set $A^*S^*$ isometric to $\Phi$ is also totally bounded,
and hence relatively compact, by the same theorem. ∎


THEOREM 5. Let A be a completely continuous operator mapping a
Banach space E into itself. Then, given any $\rho > 0$, there are only finitely
many linearly independent eigenvectors of A corresponding to eigenvalues
of absolute value greater than $\rho$.


Proof. Given any nonzero eigenvalue $\lambda$ of A, let $E_\lambda$ be the subspace of E
consisting of all eigenvectors of A corresponding to $\lambda$.⁵ Then $E_\lambda$ is
finite-dimensional, since otherwise A would fail to be completely continuous
in $E_\lambda$, and hence in E itself, by virtually the same argument as in
Example 1, p. 240. Therefore, to complete the proof, we need only show
that if $\{\lambda_n\}$ is any sequence of distinct eigenvalues of A, then $\lambda_n \to 0$ as
$n \to \infty$. This in turn will be proved once we show that there is no infinite
sequence $\{\lambda_n\}$ of distinct eigenvalues of A such that the sequence $\{1/\lambda_n\}$
is bounded.

Thus, suppose there is a sequence $\{\lambda_n\}$ of distinct eigenvalues of A
such that $\{1/\lambda_n\}$ is bounded, and let $x_n$ be an eigenvector of A corresponding
to the eigenvalue $\lambda_n$. Then the vectors $x_1, x_2, \ldots$ are linearly
independent, by the same argument as in the case where E is
finite-dimensional.⁶ Let $E_n$ be the subspace generated by $x_1, \ldots, x_n$, i.e., the
set of all elements of the form

    $$y = \sum_{k=1}^n \alpha_k x_k.$$

If $y \in E_n$, then

    $$y - \frac{1}{\lambda_n}Ay = \sum_{k=1}^n \alpha_k x_k - \sum_{k=1}^n \frac{\alpha_k \lambda_k}{\lambda_n} x_k = \sum_{k=1}^{n-1} \alpha_k \Bigl(1 - \frac{\lambda_k}{\lambda_n}\Bigr) x_k,$$

so that

    $$y - \frac{1}{\lambda_n}Ay \in E_{n-1}.$$


5 Note that $E_\lambda$ is invariant under A in the sense that $x \in E_\lambda$ implies $Ax \in E_\lambda$ (cf. Problem
5, p. 238).
6 See e.g., G. E. Shilov, op. cit., Lemma 1, p. 182.


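Let $\{y_n\}$ be a sequence such that $y_n \in E_n$, $\|y_n\| = 1$ and

    $$\rho(E_{n-1}, y_n) = \inf_{x \in E_{n-1}} \|x - y_n\| > \tfrac{1}{2}$$

(such a sequence exists by the lemma on p. 240). Then $\{y_n/\lambda_n\}$ is a
bounded sequence in E, since the numerical sequence $\{1/\lambda_n\}$ is bounded.
But at the same time the sequence $\{A(y_n/\lambda_n)\}$ cannot contain a convergent
subsequence, contrary to the complete continuity of A, since

    $$\Bigl\|A\Bigl(\frac{y_p}{\lambda_p}\Bigr) - A\Bigl(\frac{y_q}{\lambda_q}\Bigr)\Bigr\| = \Bigl\|y_p - \Bigl[y_p - \frac{1}{\lambda_p}Ay_p + A\Bigl(\frac{y_q}{\lambda_q}\Bigr)\Bigr]\Bigr\| > \frac{1}{2}$$

for all p > q, since

    $$y_p - \frac{1}{\lambda_p}Ay_p + A\Bigl(\frac{y_q}{\lambda_q}\Bigr) \in E_{p-1}.$$

This contradiction proves the theorem. ∎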


24.3. Completely continuous operators in Hilbert space. Specializing to 
the case of completely continuous operators mapping a Hilbert space into 
itself, we have 


THEOREM 6. Let A be a linear operator mapping a Hilbert space H 
into itself. Then A is completely continuous if and only if 


1) A maps every relatively compact set in the weak topology into a 
relatively compact set in the strong topology; 

2) A maps every weakly convergent sequence into a strongly convergent 
sequence. 


Proof. To prove 1), we merely note that H is the conjugate of a 
separable space, since H = H*, and hence, by Corollary 2’, p. 205, a 
subset of H is bounded if and only if it is relatively compact in the weak 
topology. 

To prove 2), suppose A maps every weakly convergent sequence
into a strongly convergent sequence, and let M be a bounded closed subset
of H. Then M contains a weakly convergent sequence and hence AM
contains a strongly convergent sequence, i.e., AM is relatively compact
in the strong topology. It follows that A is completely continuous.
Conversely, if A is completely continuous, let $\{x_n\}$ be a weakly convergent
sequence with weak limit x. Then $\{Ax_n\}$ contains a strongly convergent
subsequence. At the same time, $\{Ax_n\}$ converges weakly to Ax, by the
continuity of A, so that $\{Ax_n\}$ cannot have more than one limit point.
Therefore $\{Ax_n\}$ is a strongly convergent sequence. ∎


Let A be a self-adjoint operator in a finite-dimensional complex Euclidean
space, and suppose A has matrix $\|a_{ik}\|$ (recall Example 3, p. 222). Then it
will be recalled from linear algebra that $\|a_{ik}\|$ can be reduced to diagonal
form with respect to a suitable orthonormal basis.⁷ We now generalize this
result to the case of a completely continuous self-adjoint operator in a (real
or complex) Hilbert space (see Theorem 7 below), after first proving two
preliminary lemmas:


LEMMA 1. Let A be a completely continuous self-adjoint operator
mapping a Hilbert space H into itself, and let $\{x_n\}$ be a sequence in H
converging weakly to x. Then
\[ (Ax_n, x_n) \to (Ax, x) \tag{13} \]
as $n \to \infty$.


Proof. Clearly,
\[ |(Ax_n, x_n) - (Ax, x)| \le |(Ax_n, x_n) - (Ax, x_n)| + |(Ax, x_n) - (Ax, x)|. \]
But
\[ |(Ax_n, x_n) - (Ax, x_n)| \le \|x_n\| \, \|A(x_n - x)\|, \]
and
\[ |(Ax, x_n) - (Ax, x)| = |(x, A(x_n - x))| \le \|x\| \, \|A(x_n - x)\|, \]
where the numbers $\|x_n\|$, $n = 1, 2, \ldots$ are bounded, by Theorem 2,
p. 196, and $\|A(x_n - x)\| \to 0$ by Theorem 6. Therefore
\[ |(Ax_n, x_n) - (Ax, x)| \to 0 \]
as $n \to \infty$, which is equivalent to (13). ∎


LEMMA 2. Given a bounded linear operator A mapping a Hilbert space
H into itself, let A be self-adjoint and suppose the least upper bound of the
functional
\[ |Q(x)| = |(Ax, x)| \]
on the closed unit sphere $\|x\| \le 1$ is achieved at the point $x = x_0$. Then
\[ (x_0, y) = 0 \tag{14} \]
implies
\[ (Ax_0, y) = (x_0, Ay) = 0. \]
In particular, $x_0$ is an eigenvector of A.





7 See e.g., V. I. Smirnov, Linear Algebra and Group Theory (translated by R. A.
Silverman), McGraw-Hill Book Co., New York (1961), Sec. 40. Dover reprint (1970).
Proof. Obviously,
\[ \|x_0\| = 1. \tag{15} \]
Let
\[ x = \frac{x_0 + a y}{\sqrt{1 + |a|^2 \|y\|^2}}, \]
where a is an arbitrary complex number. Then $\|x\| = 1$, because of
(14) and (15). Since
\[ Q(x) = \frac{1}{1 + |a|^2 \|y\|^2} \bigl[ Q(x_0) + 2 \,\mathrm{Re}\, \bar{a}(Ax_0, y) + |a|^2 Q(y) \bigr], \]
we have
\[ Q(x) = Q(x_0) + 2 \,\mathrm{Re}\, \bar{a}(Ax_0, y) + O(|a|^2) \tag{16} \]
for small $|a|$. But it is clear from (16) that if $(Ax_0, y) \ne 0$, then a can be
chosen to make $|Q(x)| > |Q(x_0)|$, contrary to the assumption that the
least upper bound of $|Q(x)|$ on the closed unit sphere is achieved at the
point $x = x_0$. Therefore $(Ax_0, y) = 0$ as asserted, i.e., $Ax_0$ is orthogonal
to every vector orthogonal to $x_0$. It follows that $Ax_0$ and $x_0$ are
proportional (why?), so that $x_0$ is an eigenvector of A. ∎


THEOREM 7 (Hilbert-Schmidt). Let A be a completely continuous self-adjoint
operator mapping a Hilbert space H into itself. Then there is an
orthonormal system $\varphi_1, \varphi_2, \ldots$ of eigenvectors of A, with corresponding
nonzero eigenvalues $\lambda_1, \lambda_2, \ldots$, such that every element $x \in H$ has a unique
representation of the form⁸
\[ x = \sum_n c_n \varphi_n + x', \tag{17} \]
where $x'$ satisfies the condition $Ax' = 0$. Moreover
\[ Ax = \sum_n \lambda_n c_n \varphi_n, \tag{18} \]
and
\[ \lim_{n \to \infty} \lambda_n = 0 \]
in the case where there are infinitely many nonzero eigenvalues.

Proof. Let
\[ M_1 = \sup_{\|x\| \le 1,\ x \in H} |(Ax, x)|, \]
and let $\{x_n\}$ be a sequence of elements of H such that $\|x_n\| = 1$ and
\[ |(Ax_n, x_n)| \to M_1 \]
as $n \to \infty$. Since the closed unit sphere in H is weakly compact (recall





8 As will appear in the course of the proof, the sums in (17) and (18) may be finite or
infinite, and $x'$ may vanish.
Corollary 2’, p. 205), we can find a subsequence of $\{x_n\}$ which converges
weakly to an element $y \in H$, where clearly $\|y\| \le 1$. By Lemma 1,
\[ |(Ay, y)| = M_1, \]
and hence, by Lemma 2, y is an eigenvector of A. Moreover $\|y\| = 1$,
since if $\|y\| < 1$, then choosing
\[ y' = \frac{y}{\|y\|}, \]
we would have $\|y'\| = 1$ and
\[ |(Ay', y')| > M_1, \]
contrary to the meaning of $M_1$. We choose y as our first eigenvector $\varphi_1$.
Let $\lambda_1$ be the corresponding eigenvalue, so that
\[ A\varphi_1 = \lambda_1 \varphi_1. \]
Then
\[ |\lambda_1| = |(A\varphi_1, \varphi_1)| = M_1. \]


Next let $E_1$ be the subspace of H consisting of all vectors of the form
$a\varphi_1$, and let $E_1' = H \ominus E_1$ be the orthogonal complement of $E_1$. Clearly
$E_1'$ is again a Hilbert space, mapped into itself by the operator A (this
follows from Problem 5, p. 238 and the fact that A is self-adjoint). Let
\[ M_2 = \sup_{\|x\| \le 1,\ x \in E_1'} |(Ax, x)|. \tag{19} \]
Then, by the same argument as before, we can find an eigenvector $\varphi_2$ of
A such that $\varphi_2 \in E_1'$, $\|\varphi_2\| = 1$. Let $\lambda_2$ be the corresponding eigenvalue,
so that
\[ A\varphi_2 = \lambda_2 \varphi_2. \]
Then
\[ |\lambda_2| = |(A\varphi_2, \varphi_2)| = M_2, \]
and hence
\[ |\lambda_1| \ge |\lambda_2|, \]
since $H \supset E_1'$ implies
\[ M_1 = \sup_{\|x\| \le 1,\ x \in H} |(Ax, x)| \ge \sup_{\|x\| \le 1,\ x \in E_1'} |(Ax, x)| = M_2. \]
By its very construction, $\varphi_2$ is orthogonal to $\varphi_1$.
To construct further eigenvectors of A, we argue inductively, replacing (19) by
\[ M_{n+1} = \sup_{\|x\| \le 1,\ x \in E_n'} |(Ax, x)| \qquad (n = 1, 2, \ldots), \]
where $E_n' = H \ominus E_n$ is the orthogonal complement of the subspace $E_n$
generated by the previously constructed eigenvectors $\varphi_1, \varphi_2, \ldots, \varphi_n$.
Then $E_n'$ is again a Hilbert space mapped into itself by A, and there is an
eigenvector $\varphi_{n+1} \in E_n'$ of unit norm, with corresponding eigenvalue
$\lambda_{n+1}$ satisfying the inequality
\[ |\lambda_n| \ge |\lambda_{n+1}| \qquad (n = 1, 2, \ldots). \]
In this way, we construct an orthonormal system $\{\varphi_n\}$ of eigenvectors of A.
There are now just two possibilities, which we examine in turn:


Case 1. Suppose the construction of the sequence $\{\varphi_n\}$ terminates after
a finite number of steps, i.e., suppose there is a positive integer $n_0$ such
that $(Ax, x) = 0$ on $E_{n_0}'$. Then it follows from Lemma 2 that A maps
the whole space $E_{n_0}'$ into the zero vector. According to Theorem 14,
p. 158, every element $x \in H$ has a unique representation of the form
\[ x = h + x', \]
where $h \in E_{n_0}$, $x' \in E_{n_0}'$, and hence of the form
\[ x = \sum_{n=1}^{n_0} c_n \varphi_n + x', \]
where the sum is finite (consisting of $n_0$ terms) and $Ax' = 0$. Obviously
we have
\[ Ax = \sum_{n=1}^{n_0} \lambda_n c_n \varphi_n, \]
thereby completing the proof in this case.


Case 2. Suppose the construction of the sequence $\{\varphi_n\}$ never terminates,
i.e., suppose $(Ax, x) \not\equiv 0$ on $E_n'$ for all $n = 1, 2, \ldots$. We then
have infinitely many nonzero eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n, \ldots$. Clearly
$\lambda_n \to 0$ as $n \to \infty$. In fact, the sequence $\{\varphi_n\}$ converges weakly to zero,
like any sequence of orthonormal vectors (why?), and hence the sequence
$\{A\varphi_n\}$ converges to zero in norm, so that $\|A\varphi_n\| \to 0$ and hence
$\|\lambda_n \varphi_n\| = |\lambda_n| \to 0$. Let $E_\infty$ be the subspace of H generated by all the
eigenvectors $\varphi_1, \varphi_2, \ldots, \varphi_n, \ldots$, i.e., the set of all linear combinations
of the form
\[ \sum_{n=1}^{\infty} c_n \varphi_n, \]
and let
\[ E_\infty' = H \ominus E_\infty = \bigcap_{n=1}^{\infty} E_n'. \]
If $E_\infty' = \{0\}$, then $H = E_\infty$ and x obviously has a representation of the
form (17) with $x' = 0$ (so that $Ax' = 0$ trivially). If $E_\infty' \ne \{0\}$, let x be any
nonzero element of $E_\infty'$. Then
\[ |(Ax, x)| \le |\lambda_n| \, \|x\|^2 \]
for all $n = 1, 2, \ldots$, and hence $(Ax, x) = 0$ on $E_\infty'$. It follows from
Lemma 2 that A maps the whole space $E_\infty'$ into the zero vector. The rest
of the proof is the same as in Case 1, where (18) follows from (17) by
the continuity of A. ∎
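
The inductive construction in the proof can be imitated numerically in the finite-dimensional case. The following is only a sketch under stated assumptions: the 2 x 2 symmetric matrix is a hypothetical example, and power iteration is swapped in for the weak-compactness argument (it finds a unit vector maximizing |(Ax, x)| only when the eigenvalue of largest absolute value is unique and the start vector is not orthogonal to its eigenvector). The helper names are illustrative and do not come from the text.

```python
import math

def matvec(A, x):
    # Apply the matrix A (a list of rows) to the vector x.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    # Real scalar product (x, y).
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [xi / n for xi in x]

def extremal_eigenpair(A, start, iterations=200):
    # Power iteration: approaches a unit vector at which |(Ax, x)| is
    # largest, under the assumptions stated in the lead-in.
    x = normalize(start)
    for _ in range(iterations):
        x = normalize(matvec(A, x))
    return dot(matvec(A, x), x), x

def restrict_to_complement(A, lam, phi):
    # A - lam * phi phi^T agrees with A on the orthogonal complement of
    # the eigenvector phi and sends phi itself to zero.
    n = len(A)
    return [[A[i][j] - lam * phi[i] * phi[j] for j in range(n)]
            for i in range(n)]

A = [[2.0, 1.0], [1.0, 2.0]]          # hypothetical example matrix
lam1, phi1 = extremal_eigenpair(A, [1.0, 0.0])
A1 = restrict_to_complement(A, lam1, phi1)
lam2, phi2 = extremal_eigenpair(A1, [1.0, 0.0])

# Expansion (17) with x' = 0, and formula (18) for Ax.
x = [3.0, -1.0]
c1, c2 = dot(x, phi1), dot(x, phi2)
Ax_from_18 = [lam1 * c1 * p + lam2 * c2 * q for p, q in zip(phi1, phi2)]
```

Here the second eigenvector is found on the orthogonal complement of the first, exactly as $\varphi_2$ is sought in $E_1'$, and the eigenvalues come out ordered by absolute value.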


COROLLARY. Let A be a completely continuous self-adjoint operator
mapping a Hilbert space H into itself. Then there is an orthonormal
system $\{\psi_n\}$ of eigenvectors of A such that every element $x \in H$ has a unique
representation of the form
\[ x = \sum_{n=1}^{\infty} c_n \psi_n. \]
Moreover
\[ Ax = \sum_{n=1}^{\infty} \lambda_n c_n \psi_n, \]
where $\lambda_1, \lambda_2, \ldots$ are the eigenvalues corresponding to $\psi_1, \psi_2, \ldots$.


Proof. Noting that every element of $E_{n_0}'$ or $E_\infty'$ is an eigenvector of A
corresponding to the eigenvalue $\lambda = 0$, let $\{\psi_n\}$ consist of the orthonormal
system $\{\varphi_n\}$ constructed in the proof of Theorem 7, together
with an arbitrary orthonormal basis in $E_{n_0}'$ or $E_\infty'$. ∎


Problem 1. Prove that the projection operator of Example 4, p. 222 is
completely continuous if and only if the subspace $H_1$ is finite-dimensional.

Problem 2. Prove that the operator A mapping the point
\[ x = (x_1, x_2, \ldots, x_n, \ldots) \in l_2 \]
into the point
\[ Ax = \Bigl( x_1, \frac{x_2}{2}, \ldots, \frac{x_n}{2^{n-1}}, \ldots \Bigr) \in l_2 \]
is completely continuous. More generally, suppose
\[ Ax = (a_1 x_1, a_2 x_2, \ldots, a_n x_n, \ldots). \]
Under what conditions on the sequence $\{a_n\}$ is A completely continuous?


Hint. Since every bounded set in $l_2$ is contained in some closed sphere,
it is enough to show that the images of spheres are relatively compact. In
fact, by the linearity of A, it need only be shown that the image of the unit
sphere is compact. In this regard, recall Example 5, p. 98.


Problem 3. Let A be the integral operator on $C_{[-1,1]}$ defined by
\[ \psi(x) = (A\varphi)(x) = \int_{-1}^{x} \varphi(y)\, dy. \]
Prove that A maps the closed unit sphere in $C_{[-1,1]}$ into a noncompact set.
Reconcile this with Theorem 1.

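Hint. Let
\[ \varphi_n(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ nx & \text{if } 0 < x \le \frac{1}{n}, \\ 1 & \text{if } \frac{1}{n} < x \le 1. \end{cases} \]
Then $\varphi_n \in C_{[-1,1]}$, $\|\varphi_n\| = 1$ for all n, and
\[ \psi_n(x) = (A\varphi_n)(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ \frac{nx^2}{2} & \text{if } 0 < x \le \frac{1}{n}, \\ x - \frac{1}{2n} & \text{if } \frac{1}{n} < x \le 1. \end{cases} \]
The sequence $\{\psi_n\}$ converges in $C_{[-1,1]}$ to the function
\[ \psi(x) = \begin{cases} 0 & \text{if } -1 \le x \le 0, \\ x & \text{if } 0 < x \le 1, \end{cases} \]
which, having a discontinuous derivative, cannot be the image under A of
any function in $C_{[-1,1]}$.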
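
The uniform convergence claimed in the hint can be checked on a grid. The sketch below assumes the piecewise formula for the functions $\psi_n$ given in the hint and the limit function $\psi$; the grid size and the function names are illustrative choices. Exact rational arithmetic is used so that the maximal deviation can be compared with 1/(2n) without rounding.

```python
from fractions import Fraction

def psi_n(n, x):
    # Image of phi_n under the integral operator, as given in the hint.
    if x <= 0:
        return Fraction(0)
    if x <= Fraction(1, n):
        return n * x * x / 2
    return x - Fraction(1, 2 * n)

def psi(x):
    # Limit function; its derivative jumps at x = 0.
    return x if x > 0 else Fraction(0)

def sup_distance(n, steps=1000):
    # Maximum of |psi_n - psi| over a uniform grid on [-1, 1].
    grid = [Fraction(-1) + Fraction(2 * k, steps) for k in range(steps + 1)]
    return max(abs(psi_n(n, x) - psi(x)) for x in grid)

distances = {n: sup_distance(n) for n in (1, 2, 5, 10)}
```

On this grid the deviation equals 1/(2n) for each n tried, which is consistent with convergence in the norm of the space of continuous functions.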

Problem 4. Let A be a completely continuous operator mapping a
reflexive Banach space E (e.g., a Hilbert space) into itself. Prove that A maps
the closed unit sphere in E into a compact set. Reconcile this with the
preceding problem.


Hint. Use Theorem 6, p. 205.

Problem 5. Prove that


a) A linear combination of completely continuous operators is itself a
completely continuous operator;

b) The set $\mathscr{C}(E, E)$ of all completely continuous operators mapping a
Banach space E into itself is a closed subspace of the linear space
$\mathscr{L}(E, E)$ of all bounded linear operators mapping E into E.


Problem 6. Let $\mathscr{C}(E, E)$ and $\mathscr{L}(E, E)$ be the same as in the preceding
problem. Prove that besides being a linear space, $\mathscr{L}(E, E)$ is also a ring
when equipped with the usual operations of addition and multiplication of
operators. Prove that $\mathscr{C}(E, E)$ is a two-sided ideal in $\mathscr{L}(E, E)$.


Comment. By a two-sided ideal in a ring $\mathscr{R}$ is meant a subring $\mathscr{I} \subset \mathscr{R}$
such that $a \in \mathscr{I}$, $r \in \mathscr{R}$ implies $ar \in \mathscr{I}$, $ra \in \mathscr{I}$.

Problem 7. Let Φ and A*S* be the same as in the proof of Theorem 4.
Show that Φ is closed and hence compact. Deduce from this that A*S* is
compact, even though as shown in Problem 3, the image of the closed unit
sphere under a completely continuous operator need not be compact.


Problem 8. Discuss the connection between Theorem 4 and the theory of 
Sec. 20.4, in particular Corollary 1’, p. 204. 


Problem 9. Let A be a bounded linear operator mapping a Banach space 
E into itself. Show that if A* is completely continuous, then so is A. 


Problem 10. Prove that a linear operator A mapping a Hilbert space H 
into itself is completely continuous if and only if its adjoint (in the sense 
of Sec. 23.3) is completely continuous. 


Problem 11. Give an example of a completely continuous operator A 
mapping a Hilbert space H into itself, such that A has no eigenvectors. 
Reconcile this with Theorem 7. 


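Hint. Let A be the operator in $l_2$ such that
\[ Ax = A(x_1, x_2, \ldots, x_n, \ldots) = \Bigl( 0, x_1, \frac{x_2}{2}, \ldots, \frac{x_{n-1}}{n-1}, \ldots \Bigr). \]
Then $Ax = \lambda x$ implies
\[ \lambda x_1 = 0, \quad \lambda x_2 = x_1, \quad \lambda x_3 = \frac{x_2}{2}, \quad \ldots, \quad \lambda x_n = \frac{x_{n-1}}{n-1}, \quad \ldots, \]
and hence $x = 0$.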


Comment. This situation differs from the finite-dimensional case, where 
every linear operator (self-adjoint or not) has at least one eigenvector. 


7


MEASURE 


The concept of the measure $\mu(E)$ of a set E is a natural generalization of
such concepts as


1) The length $l(\Delta)$ of a line segment $\Delta$;

2) The area A(F) of a plane figure F;

3) The volume V(G) of a space figure G;

4) The increment $\varphi(b) - \varphi(a)$ of a nondecreasing function $\varphi(t)$ over a
half-open interval [a, b);

5) The integral of a nonnegative function over a set on the line or over 
a region in the plane or in space. 


Although the notion of measure first arose in the theory of functions of a 
real variable, it was subsequently used extensively in functional analysis, 
probability theory, the theory of dynamical systems, and other branches 
of mathematics. In Sec. 25 we discuss the measure of plane sets, starting 
from the notion of the area of a rectangle. Measure in general will then 
be studied in Secs. 26 and 27. The reader will easily confirm that the con- 
siderations in Sec. 25 are of a general nature and carry over to the case of 
the more abstract theory without essential changes. 


25. Measure in the Plane 


25.1. Measure of elementary sets. Consider the system $\mathfrak{S}$ of sets in the
xy-plane, each defined by one of the inequalities
\[ a \le x \le b, \quad a < x \le b, \quad a \le x < b, \quad a < x < b \]
and one of the inequalities
\[ c \le y \le d, \quad c < y \le d, \quad c \le y < d, \quad c < y < d, \]
where a, b, c and d are arbitrary real numbers. The sets in $\mathfrak{S}$ will be called
rectangles. The closed rectangle defined by the inequalities


\[ a \le x \le b, \quad c \le y \le d \]
is a rectangle in the usual sense (including its boundary) if a < b and c < d,
a line segment (including its end points) if a = b and c < d or if a < b and
c = d, a point if a = b, c = d, or even the empty set if a > b or c > d. The
open rectangle
\[ a < x < b, \quad c < y < d \]
is either a rectangle in the usual sense (without its boundary) if a < b and
c < d or the empty set if $a \ge b$ or $c \ge d$. Each of the rectangles of the
remaining types will be called half-open and is an ordinary rectangle minus
one, two or three sides, a line segment minus one or two end points, or
possibly the empty set.

In keeping with the concept of area familiar from elementary geometry,
we now define the measure of each set in $\mathfrak{S}$ as follows:


1) The measure of the empty set equals 0; 
2) The measure of the nonempty rectangle (closed, open or half-open) 
specified by the numbers a, b, c, and d equals 


(b — a)(d — c). 


Thus with each rectangle $P \in \mathfrak{S}$ we associate a number m(P), called its
measure, where clearly


1) m(P) is real and nonnegative;
2) m(P) is additive in the sense that if
\[ P = \bigcup_{k=1}^{n} P_k, \qquad P_i \cap P_j = \emptyset \quad (i \ne j), \]
then
\[ m(P) = \sum_{k=1}^{n} m(P_k). \]


Our problem is to define the concept of measure for sets more general than
rectangles, while preserving these two properties. The first step in this
direction is to define measure for elementary sets, where by an elementary
set we mean any set which can be represented in at least one way as a union
of a finite number of pairwise disjoint rectangles. First we prove


THEOREM 1. The union, intersection, difference and symmetric 
difference of two elementary sets are again elementary sets. 
Proof. If
\[ A = \bigcup_k P_k, \qquad B = \bigcup_l Q_l \]
are two elementary sets, then clearly
\[ A \cap B = \bigcup_{k,l} (P_k \cap Q_l) \]
is also an elementary set, since each $P_k \cap Q_l$ is obviously either a
rectangle or the empty set. Moreover, it is easy to see that the difference
of two rectangles is an elementary set. Hence, subtracting an elementary
set from a rectangle gives another elementary set (as an intersection of
elementary sets). Suppose A and B are elementary sets, and let P be a
rectangle containing both of them (such a rectangle obviously exists).
It follows from what has just been proved that
\[ A \cup B = P - [(P - A) \cap (P - B)] \]
is an elementary set. It is then an easy consequence of the formulas
\[ A - B = A \cap (P - B), \]
\[ A \triangle B = (A \cup B) - (A \cap B) \]
that the difference and symmetric difference of two elementary sets are
again elementary sets. ∎
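
The three set-theoretic formulas used in this proof hold for arbitrary subsets A, B of a set P, so they can be spot-checked on finite sets. The following sketch does this for randomly chosen subsets of a 20-element set, which here is only a finite stand-in for the rectangle P; the function name and the sample sizes are illustrative assumptions.

```python
import random

def check_identities(A, B, P):
    # The three formulas of the proof, for subsets A and B of P.
    union_ok = (A | B) == P - ((P - A) & (P - B))
    diff_ok = (A - B) == A & (P - B)
    sym_ok = (A ^ B) == (A | B) - (A & B)
    return union_ok and diff_ok and sym_ok

random.seed(0)
P = set(range(20))                    # finite stand-in for the rectangle P
results = []
for _ in range(100):
    A = {p for p in P if random.random() < 0.5}
    B = {p for p in P if random.random() < 0.5}
    results.append(check_identities(A, B, P))
```

Such a check is of course no proof, but it guards against sign and bracket errors in the formulas.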


Remark. In other words, the system of all elementary sets is a ring $\mathscr{R}$,
as defined on p. 31.


We now define measure for elementary sets:

DEFINITION 1. Given an elementary set A, suppose
\[ A = \bigcup_k P_k, \]
where the $P_k$ are pairwise disjoint rectangles. Then by the measure of A,
denoted by m(A), is meant the number
\[ m(A) = \sum_k m(P_k), \tag{1} \]
where $m(P_k)$ is the measure of the rectangle $P_k$.

Remark. Clearly, m(A) is nonnegative and additive. Moreover, in defining
m(A), we have tacitly relied on the fact that the sum (1) does not depend on
how A is represented as a union of sets. To verify this, suppose
\[ A = \bigcup_k P_k = \bigcup_l Q_l, \]
where $P_k$ and $Q_l$ are rectangles such that
\[ P_i \cap P_j = \emptyset, \qquad Q_i \cap Q_j = \emptyset \quad (i \ne j). \]
Since the intersection $P_k \cap Q_l$ of two rectangles is itself a rectangle, it follows
from the additivity of the measure of rectangles that
\[ \sum_k m(P_k) = \sum_{k,l} m(P_k \cap Q_l) = \sum_l m(Q_l). \]

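
The independence of the sum (1) from the chosen decomposition can be seen on a small example. The sketch below encodes a rectangle as a hypothetical tuple (a, b, c, d), cuts one L-shaped elementary set into rectangles in two different ways, and compares both sums with the sum over the common refinement formed by the pairwise intersections. The data and the function names are assumptions made for illustration only.

```python
def area(r):
    # r = (a, b, c, d) stands for the rectangle a <= x < b, c <= y < d;
    # an empty rectangle (b <= a or d <= c) has measure 0.
    a, b, c, d = r
    return max(b - a, 0) * max(d - c, 0)

def intersect(r, s):
    # The intersection of two rectangles is again a rectangle.
    return (max(r[0], s[0]), min(r[1], s[1]),
            max(r[2], s[2]), min(r[3], s[3]))

def measure(rects):
    # Formula (1): sum of the measures of pairwise disjoint rectangles.
    return sum(area(r) for r in rects)

# Two decompositions of the same L-shaped elementary set.
P = [(0, 2, 0, 1), (0, 1, 1, 2)]      # horizontal bar plus upper-left square
Q = [(0, 1, 0, 2), (1, 2, 0, 1)]      # vertical bar plus lower-right square

# Sum over the common refinement made up of the intersections.
common = sum(area(intersect(p, q)) for p in P for q in Q)
```

Both decompositions and the refinement give the same number, as the argument in the remark predicts.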
THEOREM 2. If A is an elementary set and $\{A_n\}$ is a finite or countable
system of elementary sets such that
\[ A \subset \bigcup_n A_n, \]
then
\[ m(A) \le \sum_n m(A_n). \tag{2} \]


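Proof. Given any $\varepsilon > 0$, there is a closed elementary set $\bar{A}$ contained
in A and satisfying the condition
\[ m(\bar{A}) \ge m(A) - \frac{\varepsilon}{2}. \]
In fact, to get $\bar{A}$ we need only replace each of the k rectangles $P_i$ making
up A by a closed rectangle contained in $P_i$ of area no less than
\[ m(P_i) - \frac{\varepsilon}{2k}. \]
Moreover, for each $A_n$ there is clearly an open elementary set $\tilde{A}_n$ containing
$A_n$ and satisfying the condition
\[ m(\tilde{A}_n) \le m(A_n) + \frac{\varepsilon}{2^{n+1}}. \]
Obviously,
\[ \bar{A} \subset \bigcup_n \tilde{A}_n. \]
Hence, by the Heine-Borel theorem (recall p. 92), there is a finite
system $\tilde{A}_{n_1}, \ldots, \tilde{A}_{n_s}$ covering $\bar{A}$, where
\[ m(\bar{A}) \le \sum_{i=1}^{s} m(\tilde{A}_{n_i}), \]
since otherwise $\bar{A}$ would be covered by a finite number of rectangles of
total area less than $m(\bar{A})$, which is impossible. Therefore
\[ m(A) \le m(\bar{A}) + \frac{\varepsilon}{2} \le \sum_{i=1}^{s} m(\tilde{A}_{n_i}) + \frac{\varepsilon}{2} \le \sum_n m(\tilde{A}_n) + \frac{\varepsilon}{2} \le \sum_n m(A_n) + \sum_n \frac{\varepsilon}{2^{n+1}} + \frac{\varepsilon}{2} \le \sum_n m(A_n) + \varepsilon, \]
which implies (2), since $\varepsilon > 0$ is arbitrary. ∎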
25.2. Lebesgue measure of plane sets. Elementary sets are, of course, far
from being the most general plane sets considered in geometry and analysis.
Thus we naturally arrive at the problem of extending the concept of measure
(while preserving its basic properties) to sets more general than finite unions
of rectangles with sides parallel to the coordinate axes. This problem is
solved in a definitive way by Lebesgue’s theory of measure, in which we
consider countably infinite unions of rectangles, as well as finite unions.
To avoid sets of “infinite measure,” we restrict our discussion to subsets
of the closed unit square E, defined by the inequalities
\[ 0 \le x \le 1, \quad 0 \le y \le 1 \]
(this restriction is dropped in Remarks 2 and 3, p. 267).


DEFINITION 2. By the outer measure of a set $A \subset E$ is meant the
number
\[ \mu^*(A) = \inf_{A \subset \bigcup_k P_k} \sum_k m(P_k), \]
where the greatest lower bound is taken over all coverings of A by a finite
or countable system of rectangles $P_k$.


DEFINITION 3. By the inner measure of a set $A \subset E$ is meant the
number
\[ \mu_*(A) = 1 - \mu^*(E - A). \]


THEOREM 3. The inequality
\[ \mu_*(A) \le \mu^*(A) \]
holds for any set $A \subset E$.


Proof. Suppose
\[ \mu_*(A) > \mu^*(A), \]
i.e.,
\[ \mu^*(A) + \mu^*(E - A) < 1. \]
Then, by the definition of a greatest lower bound, there are systems of
rectangles $\{P_k\}$ and $\{Q_l\}$ covering A and E − A, respectively, such that
\[ \sum_k m(P_k) + \sum_l m(Q_l) < 1. \]
Let $\{R_j\}$ denote the union of the systems $\{P_k\}$ and $\{Q_l\}$. Then
\[ E \subset \bigcup_j R_j, \]
while
\[ m(E) > \sum_j m(R_j), \]
contrary to Theorem 2. ∎

DEFINITION 4. A set A is said to be (Lebesgue) measurable if
\[ \mu_*(A) = \mu^*(A), \]
i.e., if its inner and outer measures coincide.


DEFINITION 5. If a set A is measurable, the number $\mu(A)$ equal to the
common value of $\mu_*(A)$ and $\mu^*(A)$ is called the (Lebesgue) measure of A.


For outer measure, we have the following analogue of Theorem 2: 


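THEOREM 4. If A is any set and $\{A_n\}$ is a finite or countable system of
sets such that
\[ A \subset \bigcup_n A_n, \]
then
\[ \mu^*(A) \le \sum_n \mu^*(A_n). \tag{$2'$} \]


Proof. Given any $\varepsilon > 0$, for each $A_n$ there is a finite or countable
system of rectangles $\{P_{nk}\}$ such that
\[ A_n \subset \bigcup_k P_{nk} \]
and
\[ \sum_k m(P_{nk}) \le \mu^*(A_n) + \frac{\varepsilon}{2^n}, \]
by the definition of outer measure. Then
\[ A \subset \bigcup_n \bigcup_k P_{nk}, \]
and
\[ \mu^*(A) \le \sum_n \sum_k m(P_{nk}) \le \sum_n \mu^*(A_n) + \varepsilon, \]
which implies (2′), since $\varepsilon > 0$ is arbitrary. ∎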
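
The proof distributes the total allowance $\varepsilon$ over countably many coverings by granting the n-th covering an excess of at most $\varepsilon/2^n$. A short sketch of this bookkeeping follows; the value of eps and the numbers of terms are arbitrary illustrative choices, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def total_slack(eps, terms):
    # Sum of the allowances eps / 2**n for n = 1, ..., terms, granted to
    # the coverings of the first `terms` sets.
    return sum(eps / 2**n for n in range(1, terms + 1))

eps = Fraction(1, 100)
slack = [total_slack(eps, N) for N in (1, 5, 20)]
```

However many sets are covered, the accumulated excess stays below eps, since the partial sums of the geometric series equal eps(1 − 2^(−N)).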


COROLLARY. If A is any measurable set and $\{A_n\}$ is a finite or countable
system of measurable sets such that
\[ A \subset \bigcup_n A_n, \]
then
\[ \mu(A) \le \sum_n \mu(A_n). \tag{$2''$} \]

Proof. Merely replace $\mu^*$ by $\mu$ in (2′). ∎


Next we show that the Lebesgue measure of an elementary set coincides 
with its measure as previously defined: 


THEOREM 5. Every elementary set $A \subset E$ is measurable, with Lebesgue
measure $\mu(A)$ equal to the measure m(A) introduced in Definition 1.

Proof. Suppose A is the union of the pairwise disjoint rectangles
$P_1, \ldots, P_k$. Then
\[ m(A) = \sum_{i=1}^{k} m(P_i), \]
by Definition 1. Therefore, since the rectangles $P_1, \ldots, P_k$ obviously
cover A,
\[ \mu^*(A) \le \sum_{i=1}^{k} m(P_i) = m(A), \tag{3} \]
by Definition 2. Moreover, if $\{Q_j\}$ is any finite or countable system of
rectangles covering A, we have
\[ m(A) \le \sum_j m(Q_j) \]
by Theorem 2, and hence
\[ m(A) \le \mu^*(A), \tag{4} \]
by Definition 2 again. Comparing (3) and (4), we get
\[ m(A) = \mu^*(A). \]
Now E − A is also an elementary set, and hence
\[ m(E - A) = \mu^*(E - A). \]
But
\[ m(E - A) = 1 - m(A), \]
while
\[ \mu^*(E - A) = 1 - \mu_*(A). \]
It follows that
\[ m(A) = \mu_*(A), \]
and hence
\[ m(A) = \mu_*(A) = \mu^*(A). \]
∎


COROLLARY. Theorem 2 is a special case of Theorem 4.

Proof. Merely replace $\mu^*$ by m in (2′) or $\mu$ by m in (2″).

LEMMA. The inequality
\[ |\mu^*(A) - \mu^*(B)| \le \mu^*(A \triangle B) \tag{5} \]
holds for any two sets A and B.


Proof. Since
\[ A \subset B \cup (A \triangle B), \]
it follows from Theorem 4 that
\[ \mu^*(A) \le \mu^*(B) + \mu^*(A \triangle B). \tag{6} \]
This implies (5) if $\mu^*(A) \ge \mu^*(B)$. If $\mu^*(A) \le \mu^*(B)$, we deduce (5)
from the inequality
\[ \mu^*(B) \le \mu^*(A) + \mu^*(A \triangle B) \]
obtained by interchanging the roles of A and B in (6). ∎


THEOREM 6. A set A is measurable if and only if, given any $\varepsilon > 0$,
there is an elementary set B such that
\[ \mu^*(A \triangle B) < \varepsilon. \tag{7} \]


Proof. Suppose that given any $\varepsilon > 0$, there is an elementary set B
such that (7) holds. Then, by the lemma,
\[ |\mu^*(A) - \mu^*(B)| = |\mu^*(A) - m(B)| < \varepsilon, \tag{8} \]
and similarly
\[ |\mu^*(E - A) - m(E - B)| < \varepsilon, \tag{9} \]
since
\[ (E - A) \triangle (E - B) = A \triangle B. \]
Bearing in mind that
\[ m(B) + m(E - B) = m(E) = 1, \]
we deduce from (8) and (9) that
\[ |\mu^*(A) + \mu^*(E - A) - 1| < 2\varepsilon, \]
and hence that
\[ \mu^*(A) + \mu^*(E - A) = 1, \tag{10} \]
since $\varepsilon > 0$ is arbitrary. But then $\mu_*(A) = \mu^*(A)$, so that A is
measurable.

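Conversely, suppose A is measurable, i.e., suppose (10) holds. Then,
given any $\varepsilon > 0$, there are systems of rectangles $\{B_n\}$ and $\{C_n\}$ covering
A and E − A, respectively, such that
\[ \sum_n m(B_n) \le \mu^*(A) + \frac{\varepsilon}{3}, \tag{11} \]
\[ \sum_n m(C_n) \le \mu^*(E - A) + \frac{\varepsilon}{3}. \tag{12} \]
Moreover, since $\sum_n m(B_n) < \infty$, there is an N such that
\[ \sum_{n > N} m(B_n) < \frac{\varepsilon}{3}. \]
We now show that (7) holds for the elementary set
\[ B = \bigcup_{n=1}^{N} B_n. \]
Clearly, the set
\[ P = \bigcup_{n > N} B_n \]
contains A − B, while the set
\[ Q = \bigcup_n (B \cap C_n) \]
contains B − A, and hence
\[ A \triangle B \subset P \cup Q. \tag{13} \]
Moreover,
\[ \mu^*(P) \le \sum_{n > N} m(B_n) < \frac{\varepsilon}{3}. \tag{14} \]
To estimate $\mu^*(Q)$, we note that
\[ \Bigl( \bigcup_n B_n \Bigr) \cup \Bigl( \bigcup_n (C_n - B) \Bigr) = E, \]
and hence
\[ \sum_n m(B_n) + \sum_n m(C_n - B) \ge 1. \tag{15} \]
But (11) and (12) imply
\[ \sum_n m(B_n) + \sum_n m(C_n) \le \mu^*(A) + \mu^*(E - A) + \frac{2\varepsilon}{3} = 1 + \frac{2\varepsilon}{3}. \tag{16} \]
Subtracting (15) from (16), we get
\[ \sum_n m(C_n) - \sum_n m(C_n - B) = \sum_n m(C_n \cap B) \le \frac{2\varepsilon}{3}, \]
i.e.,
\[ \mu^*(Q) \le \frac{2\varepsilon}{3}. \tag{17} \]
Finally, comparing (13), (14) and (17), we find that
\[ \mu^*(A \triangle B) \le \mu^*(P \cup Q) \le \mu^*(P) + \mu^*(Q) < \varepsilon. \]
∎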


THEOREM 7. The union and intersection of a finite number of measurable
sets are again measurable sets.


Proof. It is enough to prove the theorem for two sets. Thus suppose
$A_1$ and $A_2$ are measurable sets. Then, by Theorem 6, given any $\varepsilon > 0$,
there are elementary sets $B_1$ and $B_2$ such that
\[ \mu^*(A_1 \triangle B_1) < \frac{\varepsilon}{2}, \qquad \mu^*(A_2 \triangle B_2) < \frac{\varepsilon}{2}. \]
Since
\[ (A_1 \cup A_2) \triangle (B_1 \cup B_2) \subset (A_1 \triangle B_1) \cup (A_2 \triangle B_2), \]
we have
\[ \mu^*[(A_1 \cup A_2) \triangle (B_1 \cup B_2)] \le \mu^*(A_1 \triangle B_1) + \mu^*(A_2 \triangle B_2) < \varepsilon. \]
But $B_1 \cup B_2$ is an elementary set, and hence $A_1 \cup A_2$ is measurable, by
Theorem 6 again. Moreover, a set A is measurable if and only if
\[ \mu^*(A) + \mu^*(E - A) = 1, \]
and hence if A is measurable, so is E − A. Therefore the measurability
of $A_1 \cap A_2$ follows from that of $A_1 \cup A_2$ and the formula
\[ A_1 \cap A_2 = E - [(E - A_1) \cup (E - A_2)]. \]
∎

COROLLARY. The difference and symmetric difference of two measurable
sets are again measurable sets.

Proof. An immediate consequence of Theorem 7 and the formulas
\[ A_1 - A_2 = A_1 \cap (E - A_2), \]
\[ A_1 \triangle A_2 = (A_1 - A_2) \cup (A_2 - A_1). \]


THEOREM 8. If $A_1, \ldots, A_n$ are pairwise disjoint measurable sets, then
\[ \mu\Bigl( \bigcup_{k=1}^{n} A_k \Bigr) = \sum_{k=1}^{n} \mu(A_k). \]


Proof. As in the proof of Theorem 7, we need only consider the case
n = 2. By Theorem 6, given any $\varepsilon > 0$, there are elementary sets $B_1$
and $B_2$ such that
\[ \mu^*(A_1 \triangle B_1) < \varepsilon, \qquad \mu^*(A_2 \triangle B_2) < \varepsilon. \tag{18} \]
Let
\[ A = A_1 \cup A_2, \qquad B = B_1 \cup B_2. \]
Then A is measurable, by Theorem 7. Since $A_1$ and $A_2$ are disjoint, we
have
\[ B_1 \cap B_2 \subset (A_1 \triangle B_1) \cup (A_2 \triangle B_2), \]
and hence
\[ m(B_1 \cap B_2) \le 2\varepsilon. \tag{19} \]
Moreover, it follows from (18) and the lemma on p. 260 that
\[ |m(B_1) - \mu^*(A_1)| < \varepsilon, \qquad |m(B_2) - \mu^*(A_2)| < \varepsilon. \tag{20} \]
Since measure is additive on elementary sets, it follows from (19) and
(20) that
\[ m(B) = m(B_1) + m(B_2) - m(B_1 \cap B_2) \ge \mu^*(A_1) + \mu^*(A_2) - 4\varepsilon. \]
Noting also that
\[ A \triangle B \subset (A_1 \triangle B_1) \cup (A_2 \triangle B_2), \]
we have
\[ \mu^*(A) \ge m(B) - \mu^*(A \triangle B) \ge m(B) - 2\varepsilon \ge \mu^*(A_1) + \mu^*(A_2) - 6\varepsilon. \]
Therefore
\[ \mu^*(A) \ge \mu^*(A_1) + \mu^*(A_2), \tag{21} \]
since $\varepsilon > 0$ can be made arbitrarily small. On the other hand, it follows
from $A = A_1 \cup A_2$ and Theorem 4 that
\[ \mu^*(A) \le \mu^*(A_1) + \mu^*(A_2). \tag{22} \]
Comparing (21) and (22), we get
\[ \mu^*(A) = \mu^*(A_1) + \mu^*(A_2), \]
where $\mu^*$ can be replaced by $\mu$, since $A_1$, $A_2$ and A are measurable. ∎


THEOREM 9. The union and intersection of a countable number of 
measurable sets are again measurable sets. 


Proof. Given a countable system of measurable sets $\{A_n\}$, let
\[ A = \bigcup_{n=1}^{\infty} A_n, \]
and let
\[ A_1' = A_1, \qquad A_n' = A_n - \bigcup_{k=1}^{n-1} A_k \quad (n = 2, 3, \ldots). \]
Then the sets $A_n'$ are pairwise disjoint, and
\[ A = \bigcup_{n=1}^{\infty} A_n'. \]
By Theorem 7 and its corollary, the sets $A_n'$ are all measurable. Moreover,
by Theorems 4 and 8,
\[ \sum_{n=1}^{N} \mu(A_n') = \mu\Bigl( \bigcup_{n=1}^{N} A_n' \Bigr) \le \mu^*(A) \]
for every N = 1, 2, .... Therefore the series
\[ \sum_{n=1}^{\infty} \mu(A_n') \]
converges, and hence, given any $\varepsilon > 0$, there is an integer $\nu > 0$ such that
\[ \sum_{n > \nu} \mu(A_n') < \frac{\varepsilon}{2}. \tag{23} \]
Since the set
\[ C = \bigcup_{n=1}^{\nu} A_n' \]
is measurable, being the union of a finite number of measurable sets,
there is an elementary set B such that
\[ \mu^*(C \triangle B) < \frac{\varepsilon}{2}. \tag{24} \]
Moreover, since
\[ A \triangle B \subset (C \triangle B) \cup \Bigl( \bigcup_{n > \nu} A_n' \Bigr), \]
it follows from (23) and (24) that
\[ \mu^*(A \triangle B) < \varepsilon. \]
Therefore A is measurable, by Theorem 6. Finally, since complements of
measurable sets are themselves measurable, the intersection
\[ \bigcap_{n=1}^{\infty} A_n = E - \bigcup_{n=1}^{\infty} (E - A_n) \]
is measurable. ∎

Theorem 9 generalizes Theorem 7 to the case of a countable number of 
measurable sets. The corresponding generalization of Theorem 8 is given by 


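THEOREM 10. If $A_1, A_2, \ldots, A_n, \ldots$ are pairwise disjoint measurable
sets, then
\[ \mu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \sum_{n=1}^{\infty} \mu(A_n). \tag{25} \]

Proof. Let
\[ A = \bigcup_{n=1}^{\infty} A_n. \]
Then, since
\[ \bigcup_{n=1}^{N} A_n \subset A \]
for every N = 1, 2, ..., it follows from Theorem 8 and the corollary to
Theorem 4 that
\[ \sum_{n=1}^{N} \mu(A_n) = \mu\Bigl( \bigcup_{n=1}^{N} A_n \Bigr) \le \mu(A). \]
Taking the limit as $N \to \infty$, we get
\[ \sum_{n=1}^{\infty} \mu(A_n) \le \mu(A). \tag{26} \]
On the other hand, since obviously
\[ A \subset \bigcup_{n=1}^{\infty} A_n, \]
it follows from the same corollary that
\[ \mu(A) \le \sum_{n=1}^{\infty} \mu(A_n). \tag{27} \]
Comparing (26) and (27), we get
\[ \mu(A) = \sum_{n=1}^{\infty} \mu(A_n), \]
or equivalently (25). ∎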


The key property of the measure $\mu$ expressed by (25) is described by
saying that $\mu$ is countably additive or $\sigma$-additive.


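THEOREM 11. Let $\{A_n\}$ be a sequence of measurable sets which is
decreasing in the sense that
\[ A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots. \]
Then
\[ \lim_{n \to \infty} \mu(A_n) = \mu(A), \tag{28} \]
where
\[ A = \bigcap_{n=1}^{\infty} A_n. \]


Proof. We need only consider the case $A = \emptyset$, to which the general
case reduces if $A_n$ is replaced by $A_n - A$. Clearly
\[ A_1 = (A_1 - A_2) \cup (A_2 - A_3) \cup \cdots, \]
and
\[ A_n = (A_n - A_{n+1}) \cup (A_{n+1} - A_{n+2}) \cup \cdots. \]
Therefore, by the $\sigma$-additivity of $\mu$,
\[ \mu(A_1) = \sum_{k=1}^{\infty} \mu(A_k - A_{k+1}) \tag{29} \]
and
\[ \mu(A_n) = \sum_{k=n}^{\infty} \mu(A_k - A_{k+1}). \tag{30} \]
Since the series (29) converges, its remainder (30) approaches 0 as $n \to \infty$.
It follows that
\[ \lim_{n \to \infty} \mu(A_n) = 0 = \mu(\emptyset). \]
∎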
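
The series (29) and its remainder (30) can be followed on a concrete decreasing sequence. As a hypothetical example, take the strips $A_n = \{0 < x < 1/n,\ 0 \le y \le 1\}$ in the unit square, whose intersection is empty and whose measures are 1/n. The sketch below sums the measures of the differences $A_k - A_{k+1}$ exactly; the function names are illustrative.

```python
from fractions import Fraction

def mu_strip(n):
    # Measure of the strip A_n = {0 < x < 1/n, 0 <= y <= 1}.
    return Fraction(1, n)

def mu_ring(k):
    # Measure of A_k - A_{k+1}, by additivity.
    return mu_strip(k) - mu_strip(k + 1)

def partial_remainder(n, K):
    # Partial sum of the series (30): terms k = n, ..., K.
    return sum(mu_ring(k) for k in range(n, K + 1))

# Gap between mu(A_4) and a partial sum of its series (30).
gap = mu_strip(4) - partial_remainder(4, 1000)
```

The partial sums telescope, so the gap equals 1/1001 here, and the remainders themselves, being equal to 1/n, tend to zero as the theorem asserts.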
COROLLARY. Let $\{A_n\}$ be a sequence of measurable sets which is
increasing in the sense that
\[ A_1 \subset A_2 \subset \cdots \subset A_n \subset \cdots. \]
Then
\[ \lim_{n \to \infty} \mu(A_n) = \mu(A), \tag{$28'$} \]
where
\[ A = \bigcup_{n=1}^{\infty} A_n. \]


Proof. Apply Theorem 11 to the complements of the sets $A_n$. ∎


The property of the measure $\mu$ expressed by (28) and (28′) is described
by saying that $\mu$ is continuous.


Remark 1. To recapitulate, starting from a measure m defined on the
class of all rectangles (with sides parallel to the coordinate axes), we
have succeeded in extending m first to a measure (again denoted by m) defined
on the larger class $\mathscr{R}$ of all elementary sets and then to a Lebesgue measure
$\mu$ defined on the still larger class $\mathfrak{S}_\mu$ of all measurable sets. The class
$\mathfrak{S}_\mu$ is closed under the operations of taking countable unions and
intersections. Moreover, the measure $\mu$ is $\sigma$-additive on $\mathfrak{S}_\mu$.


Remark 2. So far we have required all our sets to be subsets of the closed
unit square
\[ E = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\}. \]
It is easy to get rid of this restriction. For example, representing the whole
plane as the union of the squares
\[ E_{mn} = \{(x, y) : m < x \le m + 1,\ n < y \le n + 1\}, \]
where m and n are arbitrary integers, we say that a plane set A is measurable
if its intersection $A_{mn} = A \cap E_{mn}$ with every square $E_{mn}$ is measurable as
previously defined and if the series
\[ \sum_{m,n} \mu(A_{mn}) \]
converges. The measure of A is then defined as
\[ \mu(A) = \sum_{m,n} \mu(A_{mn}). \tag{31} \]


All the properties of measure proved above carry over to this more general 
case in a straightforward way (give the details). 
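
For a rectangle not contained in the unit square, formula (31) can be evaluated directly, since only finitely many squares $E_{mn}$ meet it. The sketch below does this for one hypothetical rectangle with rational sides and compares the result with the product of the side lengths. The squares are treated as closed, which is an assumption of convenience: their boundaries have measure zero and do not affect the sum.

```python
from fractions import Fraction
import math

def overlap(lo1, hi1, lo2, hi2):
    # Length of the intersection of two intervals (0 if they do not meet).
    return max(min(hi1, hi2) - max(lo1, lo2), 0)

def measure_by_squares(a, b, c, d):
    # Sum of mu(A_mn) over the unit squares E_mn meeting the rectangle
    # A = {a <= x <= b, c <= y <= d}; all other terms of (31) vanish.
    total = Fraction(0)
    for m in range(math.floor(a), math.ceil(b)):
        for n in range(math.floor(c), math.ceil(d)):
            total += overlap(a, b, m, m + 1) * overlap(c, d, n, n + 1)
    return total

a, b, c, d = Fraction(-3, 2), Fraction(5, 4), Fraction(1, 3), Fraction(7, 2)
total = measure_by_squares(a, b, c, d)
```

The sum over the squares reproduces (b − a)(d − c), as it must if the extended definition is to agree with the elementary notion of area.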


Remark 3. We might go still further, calling a set A measurable with
“infinite measure” if every $A_{mn}$ is measurable and if the series (31) diverges.
Alternatively, we can regard the whole plane as the union of the squares
\[ E_n = \{(x, y) : -n \le x \le n,\ -n \le y \le n\}, \]
calling a plane set measurable, with (possibly infinite) measure
\[ \mu(A) = \lim_{n \to \infty} \mu(A_n) \tag{32} \]
if its intersection $A_n = A \cap E_n$ with every square $E_n$ is measurable as
previously defined. As an exercise, prove the consistency of (31) and (32).


Problem 1. Let E be the closed unit square. Prove that 


a) Every open subset of E is measurable; 

b) Every closed subset of E is measurable; 

c) Every set obtained from open and closed subsets of E by forming no 
more than a countable number of unions, intersections and com- 
plements is measurable. 


Comment. There are measurable subsets of E which are not of the type c). 


Problem 2. Construct a theory of Lebesgue measure for sets on the line,
starting from intervals (closed, open and half-open) instead of rectangles. 
Do the same for 


a) Sets on the circumference of a circle; 
b) Three-dimensional sets; 
c) Sets in $R^n$.


Problem 3. Prove that the set of all rational points on the line is measur- 
able, with measure zero. 


Problem 4. Prove that the Cantor set constructed in Example 4, p. 52 
is measurable, with measure zero. 


Problem 5. Prove that every set of positive measure in the interval [0, 1] 
contains a pair of points whose distance apart is a rational number. 


Problem 6. Show that the power of the set of all measurable subsets of 
the interval [0, 1] is greater than the power of the continuum. 


Problem 7. Let $C$ be a circle of circumference 1, and let $\alpha$ be an irrational
number. Let all points of $C$ which can be obtained from each other by
rotating $C$ through an angle $n\alpha\pi$ (where $n$ is any integer, positive, negative
or zero) be assigned to the same class. (Clearly, each such class contains
countably many points.) Let $\Phi_0$ be any set containing one point from each
class. Prove that $\Phi_0$ is nonmeasurable.

Hint. Let $\Phi_n$ be the set obtained by rotating $\Phi_0$ through the angle $n\alpha\pi$.
Then

$$C = \bigcup_{n=-\infty}^{\infty} \Phi_n$$

and

$$\Phi_m \cap \Phi_n = \emptyset \qquad (m \ne n).$$

If $\Phi_0$ were measurable, the congruent sets $\Phi_n$ would also be measurable.
This would imply

$$\sum_{n=-\infty}^{\infty} \mu(\Phi_n) = 1 \tag{33}$$

by the $\sigma$-additivity of $\mu$. But congruent sets must have the same measure,
i.e., if $\Phi_0$ were measurable, then

$$\mu(\Phi_n) = \mu(\Phi_0),$$

which contradicts (33).


26. General Measure Theory 


26.1. Measure on a semiring. In Sec. 25 we constructed a theory of 
measure of plane sets, starting from a measure (area) m defined on the class 
$\mathscr{S}_m$ of all rectangles (with sides parallel to the coordinate axes) and then
extending $m$ to a Lebesgue measure $\mu$ defined on the much larger class $\mathscr{S}_\mu$
of all measurable sets. The explicit formula for the area of a rectangle played 
no role in this construction. In fact, a moment’s thought shows that we only 
used the following properties of the set function m: 


1) The domain of definition $\mathscr{S}_m$ of $m$, i.e., the class of all rectangles,
is a semiring;¹
2) $m$ is real and nonnegative;
3) $m$ is additive in the sense that if $P$ is a rectangle such that

$$P = \bigcup_{k=1}^{n} P_k,$$

where $P_1, \dots, P_n$ are pairwise disjoint rectangles, then

$$m(P) = \sum_{k=1}^{n} m(P_k).$$

As will be shown in this section and the next, the construction given in
Sec. 25 for the case of plane sets can be carried out in an abstract setting,
whose very generality greatly enhances its range of applicability.


¹ We now draw freely from the material in Sec. 4, on systems of sets.
Guided by the above properties of $m$, we introduce

DEFINITION 1. A set function $\mu(A)$ is called a measure if

1) The domain of definition $\mathscr{S}_\mu$ of $\mu$ is a semiring;
2) $\mu$ is real and nonnegative;
3) $\mu$ is additive in the sense that if $A$ is a set in $\mathscr{S}_\mu$ such that

$$A = \bigcup_{k=1}^{n} A_k,$$

where $A_1, \dots, A_n$ are pairwise disjoint sets in $\mathscr{S}_\mu$, then

$$\mu(A) = \sum_{k=1}^{n} \mu(A_k).$$

Remark. It follows from $\emptyset = \emptyset \cup \emptyset$ that

$$\mu(\emptyset) = 2\mu(\emptyset),$$

and hence

$$\mu(\emptyset) = 0.$$


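THEOREM 1. Let $\mu$ be a measure on a semiring $\mathscr{S}_\mu$, and suppose the
sets $A, A_1, \dots, A_n$, where $A_1, \dots, A_n$ are disjoint subsets of $A$, all belong
to $\mathscr{S}_\mu$. Then

$$\sum_{k=1}^{n} \mu(A_k) \le \mu(A).$$

Proof. By Lemma 1, p. 33, there is a finite expansion

$$A = \bigcup_{k=1}^{s} A_k \qquad (s \ge n)$$

with $A_1, \dots, A_n$ as its first $n$ terms, where

$$A_k \in \mathscr{S}_\mu, \qquad A_k \cap A_l = \emptyset \qquad (k \ne l)$$

for all $k, l = 1, 2, \dots, s$. Hence

$$\sum_{k=1}^{n} \mu(A_k) \le \sum_{k=1}^{s} \mu(A_k) = \mu(A),$$

since $\mu$ is nonnegative and additive. ∎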


THEOREM 2. Let $\mu$ be a measure on a semiring $\mathscr{S}_\mu$, and suppose the
sets $A, A_1, \dots, A_n$ all belong to $\mathscr{S}_\mu$ and satisfy the condition

$$A \subset \bigcup_{k=1}^{n} A_k.$$

Then

$$\mu(A) \le \sum_{k=1}^{n} \mu(A_k).$$

Proof. According to Lemma 2, p. 33, there is a finite system of
pairwise disjoint sets $B_1, \dots, B_t$ belonging to $\mathscr{S}_\mu$ such that each of the
sets $A, A_1, \dots, A_n$ has a finite expansion

$$A = \bigcup_{s \in M_0} B_s, \qquad A_k = \bigcup_{s \in M_k} B_s \qquad (k = 1, \dots, n)$$

with respect to certain of the sets $B_s$, where each index $s \in M_0$ belongs to
at least one of the sets $M_k$ (recall footnote 16, p. 33). Hence each term
in the sum

$$\sum_{s \in M_0} \mu(B_s)$$

appears at least once in the double sum

$$\sum_{k=1}^{n} \sum_{s \in M_k} \mu(B_s).$$

It follows that

$$\mu(A) = \sum_{s \in M_0} \mu(B_s) \le \sum_{k=1}^{n} \sum_{s \in M_k} \mu(B_s) = \sum_{k=1}^{n} \mu(A_k). \;∎$$


COROLLARY. If $A \subset A'$, then $\mu(A) \le \mu(A')$.

Proof. Choose $n = 1$. ∎


It will be recalled that the first step in constructing Lebesgue measure of 
plane sets was to extend measure from rectangles to elementary sets, i.e., to 
finite unions of disjoint rectangles. We now consider the abstract analogue 
of this process: 


DEFINITION 2. A measure $\mu$ is called an extension of a measure $m$ if
$\mathscr{S}_m \subset \mathscr{S}_\mu$ and $\mu(A) = m(A)$ for every $A \in \mathscr{S}_m$.


THEOREM 3. Any measure $m$ defined on a semiring $\mathscr{S}_m$ has a unique
extension $\mu$ defined on the ring $\mathscr{R}(\mathscr{S}_m)$, i.e., the minimal ring generated
by $\mathscr{S}_m$.


Proof. By Theorem 3, p. 34, every set $A \in \mathscr{R}(\mathscr{S}_m)$ has a finite
expansion

$$A = \bigcup_{k=1}^{n} B_k, \tag{1}$$

where the sets $B_1, \dots, B_n$ are pairwise disjoint and belong to $\mathscr{S}_m$. Let

$$\mu(A) = \sum_{k=1}^{n} m(B_k). \tag{2}$$

Then $\mu$ is obviously real, nonnegative and additive. Moreover, the
quantity $\mu(A)$ defined by (2) is independent of the expansion (1). In fact,
suppose $A$ has another expansion of the form

$$A = \bigcup_{l=1}^{s} C_l, \tag{1'}$$

where the sets $C_1, \dots, C_s$ are pairwise disjoint and belong to $\mathscr{S}_m$. Then,
since the intersections $B_k \cap C_l$ all belong to $\mathscr{S}_m$, it follows from the
additivity of the measure $m$ that

$$\sum_{k=1}^{n} m(B_k) = \sum_{k=1}^{n} \sum_{l=1}^{s} m(B_k \cap C_l) = \sum_{l=1}^{s} m(C_l),$$

and hence

$$\sum_{l=1}^{s} m(C_l) = \mu(A),$$

as asserted. This proves the existence of the extension $\mu$. To prove the
uniqueness of $\mu$, suppose $m$ has another extension $\mu'$, and let $A$ be the
set (1). Then, by the additivity of $\mu'$,

$$\mu'(A) = \sum_{k=1}^{n} \mu'(B_k) = \sum_{k=1}^{n} m(B_k) = \mu(A).$$

Hence, since every set $A \in \mathscr{R}(\mathscr{S}_m)$ has a representation of the form (1),
the extensions $\mu$ and $\mu'$ coincide. ∎
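
The independence of formula (2) from the choice of expansion can be checked on a small example. The following is only a sketch under stated assumptions: the semiring is taken to be the half-open intervals $[a, b)$ on the line with $m([a, b)) = b - a$, and the names `m`, `mu` and `is_pairwise_disjoint` are illustrative, not part of the text.

```python
from fractions import Fraction

def m(interval):
    """Length of a half-open interval [a, b): the assumed measure on the semiring."""
    a, b = interval
    return b - a

def is_pairwise_disjoint(intervals):
    """Check that the half-open intervals [a, b) do not overlap."""
    ordered = sorted(intervals)
    return all(prev[1] <= cur[0] for prev, cur in zip(ordered, ordered[1:]))

def mu(expansion):
    """Extended measure: sum of m over a finite disjoint expansion, as in (2)."""
    if not is_pairwise_disjoint(expansion):
        raise ValueError("expansion must consist of pairwise disjoint intervals")
    return sum((m(piece) for piece in expansion), Fraction(0))

# Two different expansions of the same set [0, 1) U [2, 3).
expansion_b = [(Fraction(0), Fraction(1)), (Fraction(2), Fraction(3))]
expansion_c = [(Fraction(0), Fraction(1, 3)), (Fraction(1, 3), Fraction(1)),
               (Fraction(2), Fraction(5, 2)), (Fraction(5, 2), Fraction(3))]

# Both expansions give the same value of mu.
print(mu(expansion_b), mu(expansion_c))
```

Exact rational arithmetic is used so that the two sums can be compared for equality without rounding error.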


Remark. As already noted, the proof of Theorem 3 is a repetition in 
abstract language of the extension of measure from the semiring of rectangles 
to the minimal ring generated by this semiring, i.e., the class of elementary 
sets. 


26.2. Countably additive measures. Many problems in analysis involve 
unions of countably many sets, as well as unions of only finitely many sets. 
Correspondingly, the (finite) additivity imposed on measures in Definition 1 
turns out to be inadequate, and it is natural to introduce a stronger kind 
of additivity: 


DEFINITION 2. A measure $\mu$ with domain of definition $\mathscr{S}_\mu$ is said to be
countably additive or $\sigma$-additive if

$$\mu(A) = \sum_{n=1}^{\infty} \mu(A_n)$$

for all sets $A, A_1, \dots, A_n, \dots \in \mathscr{S}_\mu$ satisfying the conditions

$$A = \bigcup_{n=1}^{\infty} A_n, \qquad A_i \cap A_j = \emptyset \qquad (i \ne j).$$


Example. According to Theorem 10, p. 265, Lebesgue measure in the
plane is $\sigma$-additive.

THEOREM 4. Suppose a $\sigma$-additive measure $m$ on a semiring $\mathscr{S}_m$ is
extended to a measure $\mu$ on the ring $\mathscr{R}(\mathscr{S}_m)$. Then $\mu$ is also $\sigma$-additive.


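Proof. Suppose

$$A \in \mathscr{R}(\mathscr{S}_m), \qquad B_n \in \mathscr{R}(\mathscr{S}_m) \qquad (n = 1, 2, \dots)$$

and

$$A = \bigcup_{n=1}^{\infty} B_n,$$

where

$$B_k \cap B_l = \emptyset \qquad (k \ne l).$$

Then, by Theorem 3, p. 34, there exist finite expansions

$$A = \bigcup_j A_j, \qquad B_n = \bigcup_i B_{ni},$$

where

$$A_k \cap A_l = \emptyset, \qquad B_{nk} \cap B_{nl} = \emptyset \qquad (k \ne l).$$

Let

$$C_{nij} = B_{ni} \cap A_j.$$

Then the sets $C_{nij}$ are pairwise disjoint and

$$A_j = \bigcup_n \bigcup_i C_{nij}, \qquad B_{ni} = \bigcup_j C_{nij}.$$

Therefore

$$m(A_j) = \sum_n \sum_i m(C_{nij}), \tag{3}$$

$$m(B_{ni}) = \sum_j m(C_{nij}), \tag{4}$$

since $m$ is $\sigma$-additive on $\mathscr{S}_m$, and moreover

$$\mu(A) = \sum_j m(A_j), \tag{5}$$

$$\mu(B_n) = \sum_i m(B_{ni}), \tag{6}$$

by the definition of the measure $\mu$. Comparing (3)-(6), we find that

$$\mu(A) = \sum_j m(A_j) = \sum_j \sum_n \sum_i m(C_{nij}) = \sum_n \sum_i m(B_{ni}) = \sum_n \mu(B_n)$$

(the sums over $i$ and $j$ are finite, while those over $n$ are convergent). ∎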
Next we generalize Theorems 1 and 2 to the case of $\sigma$-additive measures:


THEOREM 1’. Let $\mu$ be a $\sigma$-additive measure on a semiring $\mathscr{S}_\mu$, and
suppose the sets $A, A_1, \dots, A_k, \dots$, where $A_1, \dots, A_k, \dots$ are pairwise
disjoint subsets of $A$, all belong to $\mathscr{S}_\mu$. Then

$$\sum_{k=1}^{\infty} \mu(A_k) \le \mu(A). \tag{7}$$

Proof. By Theorem 1,

$$\sum_{k=1}^{n} \mu(A_k) \le \mu(A)$$

for all $n = 1, 2, \dots$. Taking the limit as $n \to \infty$, we get (7). ∎


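THEOREM 2’. Let $\mu$ be a $\sigma$-additive measure on a semiring $\mathscr{S}_\mu$, and
suppose the sets $A, A_1, \dots, A_k, \dots$ all belong to $\mathscr{S}_\mu$ and satisfy the
condition

$$A \subset \bigcup_{k=1}^{\infty} A_k.$$

Then

$$\mu(A) \le \sum_{k=1}^{\infty} \mu(A_k). \tag{8}$$

Proof. By Theorem 4, we can assume that $\mu$ is defined on the ring
$\mathscr{R}(\mathscr{S}_\mu)$, instead of just on the semiring $\mathscr{S}_\mu$. In fact, if $\mu$ is $\sigma$-additive,
so is its extension on $\mathscr{R}(\mathscr{S}_\mu)$, which we continue to denote by $\mu$, and the
validity of (8) on $\mathscr{R}(\mathscr{S}_\mu)$ obviously implies its validity on $\mathscr{S}_\mu$. The sets

$$B_n = (A \cap A_n) - \bigcup_{k=1}^{n-1} A_k$$

belong to $\mathscr{R}(\mathscr{S}_\mu)$ and clearly satisfy the conditions

$$A = \bigcup_{n=1}^{\infty} B_n, \qquad B_n \subset A_n, \qquad B_k \cap B_l = \emptyset \qquad (k \ne l).$$

Therefore

$$\mu(A) = \sum_{n=1}^{\infty} \mu(B_n) \le \sum_{n=1}^{\infty} \mu(A_n). \;∎$$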


Problem 1. Let $X = \{x_1, x_2, \dots\}$ be any countable set, and let $p_1, p_2, \dots$
be positive numbers such that

$$\sum_{n=1}^{\infty} p_n = 1.$$

On the set $\mathscr{S}_\mu$ of all subsets of $X$, define a measure $\mu$ by the formula

$$\mu(A) = \sum_{x_n \in A} p_n \qquad (A \subset X),$$

where the sum is over all $n$ such that $x_n \in A$. Prove that $\mu$ is a $\sigma$-additive
measure, with $\mu(X) = 1$.

Comment. This kind of measure arises quite naturally in many problems
of probability theory. 
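
The measure of Problem 1 is easy to experiment with on finite subsets. The sketch below assumes the particular weights $p_n = 2^{-n}$ (any positive weights summing to 1 would do) and identifies the point $x_n$ with its index $n$; the names `weight` and `mu` are illustrative only. It checks additivity on a pair of disjoint sets and shows the partial sums approaching $\mu(X) = 1$; it is not a proof of $\sigma$-additivity.

```python
from fractions import Fraction

def weight(n):
    """Assumed weights p_n = 2**(-n) for n = 1, 2, ...; they sum to 1."""
    return Fraction(1, 2 ** n)

def mu(indices):
    """Measure of the finite set {x_n : n in indices}."""
    return sum((weight(n) for n in set(indices)), Fraction(0))

evens = {2, 4, 6}
odds = {1, 3, 5}

# Additivity on two disjoint sets.
assert mu(evens | odds) == mu(evens) + mu(odds)

# mu({x_1, ..., x_N}) = 1 - 2**(-N), which tends to mu(X) = 1 as N grows.
partial = [mu(range(1, N + 1)) for N in (1, 5, 20)]
print(partial)
```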


Problem 2. Let $X$ be the set of all rational points in the closed unit
interval $[0, 1]$, and let $\mathscr{S}_\mu$ be the set of all intersections of the set $X$ with
arbitrary closed, open and half-open subintervals of $[0, 1]$, including the
degenerate closed intervals consisting of a single point. Prove that $\mathscr{S}_\mu$ is a
semiring. Define a measure $\mu$ on $\mathscr{S}_\mu$ by the formula

$$\mu(A_{ab}) = b - a,$$

where $A_{ab}$ is the intersection of $X$ with any of the intervals $[a, b]$, $(a, b)$,
$(a, b]$, $[a, b)$. Prove that $\mu$ is additive, but not $\sigma$-additive.


Hint. Although $\mu(X) = 1$, $X$ is a countable union of single-element sets,
each of measure zero.


Problem 3. Let $\mu$ be a measure which is additive, but not $\sigma$-additive.
Prove that


a) Theorem 1’ continues to hold for $\mu$;
b) Theorem 2’ fails to hold for $\mu$.


Hint. Use Problem 2. 


Problem 4. Given a measure $\mu$ on a semiring $\mathscr{S}_\mu$, suppose

$$\mu(A) \le \sum_{k=1}^{\infty} \mu(A_k)$$

whenever the sets $A, A_1, \dots, A_k, \dots$ all belong to $\mathscr{S}_\mu$ and satisfy the
condition

$$A \subset \bigcup_{k=1}^{\infty} A_k.$$

Prove that $\mu$ is $\sigma$-additive.


Comment. It is often easier to verify that $\mu$ has this property than to
prove the $\sigma$-additivity of $\mu$ directly.


27. Extensions of Measures 


Any measure $m$ defined on a semiring $\mathscr{S}_m$ can be extended to a measure
defined on the ring $\mathscr{R}(\mathscr{S}_m)$, i.e., the minimal ring generated by $\mathscr{S}_m$.
However, if $m$ is $\sigma$-additive, we can extend $m$ to a measure defined on a much
larger class of sets than $\mathscr{R}(\mathscr{S}_m)$. This is done by the abstract analogue of
the procedure used in Sec. 25.2 to construct Lebesgue measure in the plane.
Assuming that $\mathscr{S}_m$ has a unit,² we begin with the analogues of Definitions
2-5, pp. 259-260.


DEFINITION 1. Let $m$ be a $\sigma$-additive measure on a semiring $\mathscr{S}_m$ with
a unit $E$. Then by the outer measure of a set $A \subset E$ is meant the number

$$\mu^*(A) = \inf_{A \subset \bigcup_k B_k} \sum_k m(B_k),$$

where the greatest lower bound is taken over all coverings of $A$ by a finite
or countable system of sets $B_k \in \mathscr{S}_m$.


DEFINITION 2. By the inner measure of a set $A \subset E$ is meant the
number

$$\mu_*(A) = m(E) - \mu^*(E - A).$$

Remark. By the exact analogue of Theorem 3, p. 258, it follows that

$$\mu_*(A) \le \mu^*(A).$$

DEFINITION 3. A set $A$ is said to be (Lebesgue) measurable if

$$\mu_*(A) = \mu^*(A),$$

i.e., if its inner and outer measures coincide.


DEFINITION 4. If a set $A$ is measurable, the number $\mu(A)$ equal to the
common value of $\mu_*(A)$ and $\mu^*(A)$ is called the Lebesgue measure of $A$.³


Remark. Clearly, a set $A \subset E$ is measurable if and only if

$$\mu^*(A) + \mu^*(E - A) = m(E). \tag{1}$$

In particular, it follows from (1) that if $A$ is measurable, so is $E - A$.


THEOREM 1. If $A$ is any set and $\{A_n\}$ is any finite or countable system
of sets such that

$$A \subset \bigcup_n A_n,$$

then

$$\mu^*(A) \le \sum_n \mu^*(A_n).$$

Proof. Exactly analogous to that of Theorem 4, p. 259. ∎


² The case where $\mathscr{S}_m$ fails to have a unit will be discussed later (after Theorem 7).

³ It turns out, of course, that $\mu$ is a measure as defined in Sec. 26.1 (see Theorem 5,
where the additivity of $\mu$ is proved). In particular, this justifies the use of the notation
$\mathscr{S}_\mu$ for the system of all measurable sets.
THEOREM 2. Every set $A \in \mathscr{R}(\mathscr{S}_m)$ is measurable, with Lebesgue
measure equal to $\tilde{m}(A)$, where $\tilde{m}$ is the extension of $m$ from the semiring
$\mathscr{S}_m$ to the ring $\mathscr{R}(\mathscr{S}_m)$.

Proof. Exactly analogous to that of Theorem 5, p. 259. ∎


THEOREM 3. A set $A$ is measurable if and only if, given any $\varepsilon > 0$,
there is a set $B \in \mathscr{R}(\mathscr{S}_m)$ such that

$$\mu^*(A \,\triangle\, B) < \varepsilon.$$

Proof. Exactly analogous to that of Theorem 6, p. 261. ∎


THEOREM 4. The system $\mathscr{S}_\mu$ of all measurable sets is a ring.

Proof. Exactly analogous to that of Theorem 7, p. 262 and its
corollary. ∎


Remark. Obviously $E$ is the unit of $\mathscr{S}_\mu$, so that $\mathscr{S}_\mu$ is an algebra of
sets (see p. 31).


THEOREM 5. The set function $\mu(A)$ is additive on $\mathscr{S}_\mu$.

Proof. Exactly analogous to that of Theorem 8, p. 263. ∎


THEOREM 6. The set function $\mu(A)$ is $\sigma$-additive on $\mathscr{S}_\mu$.

Proof. Exactly analogous to that of Theorem 10, p. 265. ∎


Remark. Thus $\mu$ is a $\sigma$-additive measure on the system $\mathscr{S}_\mu$ of all
measurable sets. This measure is called the Lebesgue extension of the original
measure $m$.


THEOREM 7. The system $\mathscr{S}_\mu$ of all measurable sets is a Borel algebra
with unit $E$.

Proof. Recall from p. 35 that a Borel algebra is closed under the
operations of taking countable unions and intersections. The proof is
the exact analogue of that of Theorem 9, p. 264. ∎


It is interesting to note that an arbitrary measurable set can be
approximated to within a set of measure zero by a set of a very special kind:


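THEOREM 8. Given any set $A \in \mathscr{S}_\mu$, there are sets

$$B_{nk} \in \mathscr{R}(\mathscr{S}_m) \qquad (B_{n1} \subset B_{n2} \subset \cdots \subset B_{nk} \subset \cdots)$$

and corresponding sets

$$B_n = \bigcup_k B_{nk} \in \mathscr{S}_\mu \qquad (B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots)$$

such that

$$A \subset B = \bigcap_n B_n, \qquad \mu(A) = \mu(B).$$

Proof. Given any $n$, we can cover $A$ by a union

$$C_n = \bigcup_r \Delta_{nr}$$

of sets $\Delta_{nr} \in \mathscr{S}_m$ such that

$$\mu(C_n) < \mu(A) + \frac{1}{n}.$$

Let

$$B_n = \bigcap_{k=1}^{n} C_k,$$

so that, in particular, $B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots$. Then it is easy to
see that

$$B_n = \bigcup_s \delta_{ns},$$

where $\delta_{ns} \in \mathscr{S}_m$. Next let

$$B_{nk} = \bigcup_{s=1}^{k} \delta_{ns},$$

so that, in particular,

$$B_n = \bigcup_k B_{nk}.$$

Then obviously $B_{nk} \in \mathscr{R}(\mathscr{S}_m)$ and $B_{n1} \subset B_{n2} \subset \cdots \subset B_{nk} \subset \cdots$.
Moreover

$$A \subset B = \bigcap_n B_n,$$

since $B$ is an intersection of sets containing $A$. It follows that

$$\mu(A) \le \mu(B). \tag{2}$$

On the other hand, $B \subset B_n \subset C_n$ for every $n$, and therefore

$$\mu(B) \le \mu(B_n) \le \mu(C_n) < \mu(A) + \frac{1}{n}.$$

Taking the limit as $n \to \infty$, we get

$$\mu(B) \le \mu(A),$$

which, together with (2), implies $\mu(A) = \mu(B)$. ∎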


Our construction of the Lebesgue extension of a measure $m$ defined on a
semiring $\mathscr{S}_m$ must be modified somewhat if $\mathscr{S}_m$ fails to have a unit. We
continue to use Definition 1 to define the outer measure $\mu^*$, but $\mu^*$ is now
defined only on the system $\mathscr{S}_{\mu^*}$ of all sets with coverings

$$\bigcup_k B_k \qquad (B_k \in \mathscr{S}_m)$$

such that

$$\sum_k m(B_k) < \infty.$$

Since Definition 2 is meaningless in the absence of a unit, we now define
measurable sets by using the property figuring in Theorem 3:


DEFINITION 3’. A set $A$ is said to be (Lebesgue) measurable if, given
any $\varepsilon > 0$, there is a set $B \in \mathscr{R}(\mathscr{S}_m)$ such that $\mu^*(A \,\triangle\, B) < \varepsilon$.


DEFINITION 4’. If a set $A$ is measurable, the number $\mu(A)$ equal to
its outer measure $\mu^*(A)$ is called the (Lebesgue) measure of $A$.


Remark. Note that Definitions 3’ and 4’ are equivalent to Definitions 3
and 4 if $\mathscr{S}_m$ has a unit.


In the case where $\mathscr{S}_m$ has no unit, Theorems 4-6 continue to hold, since
the proofs of Theorems 5 and 6 do not require $\mathscr{S}_m$ to have a unit, while the
proof of Theorem 4 can easily be freed of this requirement (see Problem 4).
However, Theorem 7 now takes a new form (see Problem 5). As before, the
$\sigma$-additive measure $\mu$ on the system $\mathscr{S}_\mu$ of all measurable sets is called the
Lebesgue extension of the original measure $m$.


Remark. There is an interesting analogy between the construction of the
Lebesgue extension of a measure $m$ defined on a semiring $\mathscr{S}_m$ and the process
of completing a metric space. Let $\tilde{m}$ be the extension of $m$ from the semiring
$\mathscr{S}_m$ to the ring $\mathscr{R}(\mathscr{S}_m)$, and suppose we regard $\tilde{m}(A \,\triangle\, B)$ as the distance
between the elements $A, B \in \mathscr{R}(\mathscr{S}_m)$. Then $\mathscr{R}(\mathscr{S}_m)$ becomes a metric space
(in general, incomplete), whose completion, according to Theorem 3, is just
the system $\mathscr{S}_\mu$ of all Lebesgue-measurable sets. However, note that from a
metric point of view, two sets $A, B \in \mathscr{S}_\mu$ are indistinguishable if $\mu(A \,\triangle\, B) = 0$.


Problem 1. Let $m$ be a $\sigma$-additive measure on a semiring $\mathscr{S}_m$ with a unit
$E$, let $\mu$ be the Lebesgue extension of $m$, and let $\tilde{\mu}$ be an arbitrary $\sigma$-additive
extension of $m$. Prove that $\tilde{\mu}(A) = \mu(A)$ for every measurable set $A$ on
which $\tilde{\mu}$ is defined.


Hint. First show that $\mu_*(A) \le \tilde{\mu}(A) \le \mu^*(A)$.


Problem 2. Let $m$ be the same as in the preceding problem, and let $\tilde{m}$ be
the extension of $m$ to a measure defined on $\mathscr{R}(\mathscr{S}_m)$. Prove that the outer
measure of a set $A \subset E$ is given by

$$\mu^*(A) = \inf_{A \subset \bigcup_k B_k} \sum_k \tilde{m}(B_k),$$

where the greatest lower bound is taken over all coverings of $A$ by a finite
or countable system of sets $B_k \in \mathscr{R}(\mathscr{S}_m)$.


Problem 3. State and prove the analogues of Theorem 11, p. 266 and its
corollary for an arbitrary $\sigma$-additive measure $\mu$ defined on a Borel algebra
$\mathscr{S}_\mu$ with unit $E$.


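Problem 4. Give a proof of Theorem 4 valid in the case where $\mathscr{S}_m$ fails
to have a unit.


Hint. Suppose $A_1, A_2 \in \mathscr{S}_\mu$. Then $A_1 \cup A_2 \in \mathscr{S}_\mu$ by the same proof as
before (cf. p. 262). Moreover, there are sets $B_1, B_2 \in \mathscr{R}(\mathscr{S}_m)$ such that

$$\mu^*(A_1 \,\triangle\, B_1) < \frac{\varepsilon}{2}, \qquad \mu^*(A_2 \,\triangle\, B_2) < \frac{\varepsilon}{2}.$$

But

$$(A_1 - A_2) \,\triangle\, (B_1 - B_2) \subset (A_1 \,\triangle\, B_1) \cup (A_2 \,\triangle\, B_2),$$

and hence $\mu^*(A \,\triangle\, B) < \varepsilon$, where $A = A_1 - A_2$ and $B = B_1 - B_2 \in \mathscr{R}(\mathscr{S}_m)$.
Therefore $A_1 - A_2 \in \mathscr{S}_\mu$. To prove that $A_1 \cap A_2$ and $A_1 \,\triangle\, A_2$ belong to $\mathscr{S}_\mu$,
use the formulas

$$A_1 \cap A_2 = A_1 - (A_1 - A_2),$$

$$A_1 \,\triangle\, A_2 = (A_1 - A_2) \cup (A_2 - A_1).$$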

Problem 5. Given a measure $m$ on a semiring $\mathscr{S}_m$ with no unit, let $\mu$
be the Lebesgue extension of $m$ and $\mathscr{S}_\mu$ the corresponding system of all
measurable sets. Prove that

a) $\mathscr{S}_\mu$ is a $\delta$-ring (see p. 35);
b) The set

$$A = \bigcup_k A_k \qquad (A_k \in \mathscr{S}_\mu)$$

belongs to $\mathscr{S}_\mu$ if and only if there is a constant $C > 0$ such that

$$\mu\Bigl(\bigcup_{k=1}^{n} A_k\Bigr) \le C \tag{3}$$

for all $n = 1, 2, \dots$


Comment. The necessity of the condition (3) is obvious, since our
measures are always finite.


Problem 6. Let $\mu$ and $\mathscr{S}_\mu$ be the same as in the preceding problem.
Prove that the system of all sets $B \in \mathscr{S}_\mu$ which are subsets of a fixed set
$A \in \mathscr{S}_\mu$ is a Borel algebra with unit $A$.


Problem 7. A measure $\mu$ is said to be complete if every subset of a set
of measure zero is measurable, i.e., if $A' \subset A$, $\mu(A) = 0$ implies $A' \in \mathscr{S}_\mu$.
(If $A' \in \mathscr{S}_\mu$, then obviously $\mu(A') = 0$.) Prove that the Lebesgue extension
of any measure $m$ is complete.

Hint. If $A' \subset A$ and $\mu(A) = 0$, then $\mu^*(A') = 0$. But $\emptyset \in \mathscr{R}(\mathscr{S}_m)$ and
$\mu^*(A' \,\triangle\, \emptyset) = \mu^*(A') = 0$.

Problem 8. Let $\tilde{m}$ be a measure defined on a ring $\mathscr{R}$. For example, $\tilde{m}$
might be the extension of a measure $m$ originally defined on a semiring $\mathscr{S}_m$
to a measure defined on the minimal ring $\mathscr{R} = \mathscr{R}(\mathscr{S}_m)$ generated by $\mathscr{S}_m$.
Then a set $A$ is said to be Jordan measurable if, given any $\varepsilon > 0$, there are
sets $A', A'' \in \mathscr{R}$ such that

$$A' \subset A \subset A'', \qquad \tilde{m}(A'' - A') < \varepsilon.$$

Prove that the system $\mathscr{R}^*$ of all Jordan-measurable sets is a ring containing
$\mathscr{R}$.


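Problem 9. Let $\tilde{m}$, $\mathscr{R}$ and $\mathscr{R}^*$ be the same as in the preceding problem,
and let $\mathscr{A}$ be the system of all sets $A$ such that there is a set $B \in \mathscr{R}$ containing
$A$. Given any set $A \in \mathscr{A}$, let

$$\bar{\mu}(A) = \inf_{B \supset A,\ B \in \mathscr{R}} \tilde{m}(B), \qquad \underline{\mu}(A) = \sup_{B \subset A,\ B \in \mathscr{R}} \tilde{m}(B)$$

(since $\emptyset \subset A$, $A$ always contains a set in $\mathscr{R}$). Prove that

a) $\underline{\mu}(A) \le \bar{\mu}(A)$;
b) The ring $\mathscr{R}^*$ coincides with the system of all sets $A \in \mathscr{A}$ for which
$\underline{\mu}(A) = \bar{\mu}(A)$;
c) If

$$A = \bigcup_{k=1}^{n} A_k,$$

where $A, A_1, \dots, A_n$ all belong to $\mathscr{A}$, then

$$\bar{\mu}(A) \le \sum_{k=1}^{n} \bar{\mu}(A_k);$$

d) If $A_1, \dots, A_n$ are pairwise disjoint sets contained in a set $A$, then

$$\underline{\mu}(A) \ge \sum_{k=1}^{n} \underline{\mu}(A_k).$$

By the Jordan measure of a set $A \in \mathscr{R}^*$, we mean the number $\mu(A)$ equal to
the common value of $\underline{\mu}(A)$ and $\bar{\mu}(A)$. Prove that $\mu$ is a measure on $\mathscr{R}^* = \mathscr{S}_\mu$.


Comment. The measure $\mu$ is called the Jordan extension of the measure
$\tilde{m}$. If $\tilde{m}$ is itself an extension of a measure $m$ originally defined on a semiring
$\mathscr{S}_m$, we write $\mathscr{R}^* = \mathscr{R}^*(\mathscr{S}_m)$ and call $\mu$ the Jordan extension of the measure
$m$, as well as of the “intermediate” measure $\tilde{m}$.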
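Problem 10. Given two measures $\tilde{m}_1$ and $\tilde{m}_2$ defined on rings $\mathscr{R}_1$ and $\mathscr{R}_2$,
let $\mu_1$ and $\mu_2$ be their Jordan extensions onto the larger rings $\mathscr{R}_1^* = \mathscr{S}_{\mu_1}$ and
$\mathscr{R}_2^* = \mathscr{S}_{\mu_2}$. Prove that $\mu_1$ and $\mu_2$ coincide if and only if

$$\mathscr{R}_1 \subset \mathscr{S}_{\mu_2}, \qquad \tilde{m}_1(A) = \mu_2(A) \quad \text{for all } A \in \mathscr{R}_1,$$

$$\mathscr{R}_2 \subset \mathscr{S}_{\mu_1}, \qquad \tilde{m}_2(A) = \mu_1(A) \quad \text{for all } A \in \mathscr{R}_2.$$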

Problem 11. Let $m$ be the measure defined in Sec. 25.1 on the ring $\mathscr{R}$ of
all elementary sets (i.e., all finite unions of disjoint rectangles with sides
parallel to the coordinate axes), and let $\mu$ be the Jordan extension of $m$.
Prove that $\mu$ does not depend on the particular choice of the underlying
rectangular coordinate system. In other words, prove that $\mu$ (as well as
the corresponding ring $\mathscr{R}^* = \mathscr{S}_\mu$) does not change if all the sets in $\mathscr{R}$ are
subjected to the same shift and rigid rotation.


Problem 12. We say that a set $A$ is a set of uniqueness for a measure $m$ if


1) There is an extension of $m$ defined on $A$;
2) If $\mu_1$ and $\mu_2$ are two such extensions, then $\mu_1(A) = \mu_2(A)$.


Prove that the system of sets of uniqueness of a measure $m$ defined on a
semiring $\mathscr{S}_m$ coincides with the ring $\mathscr{R}^* = \mathscr{R}^*(\mathscr{S}_m)$ of sets which are Jordan
measurable (with respect to $m$). In other words, prove that the Jordan
extension of a measure $m$ originally defined on a semiring $\mathscr{S}_m$ is the unique
extension of $m$ to a measure defined on $\mathscr{R}^* = \mathscr{R}^*(\mathscr{S}_m)$, but that the
extension of $m$ to a larger system is no longer unique.


Problem 13. Prove that if a set A is Jordan measurable, then 


a) A is Lebesgue measurable; 
b) The Jordan and Lebesgue measures of A coincide. 


Prove that every Jordan extension of a $\sigma$-additive measure is $\sigma$-additive.

Problem 14. Give an example of a set which is Lebesgue measurable, but
not Jordan measurable.


Problem 15. We say that a set $A$ is a set of $\sigma$-uniqueness for a $\sigma$-additive
measure $m$ if

1) There is a $\sigma$-additive extension of $m$ defined on $A$;

2) If $\mu_1$ and $\mu_2$ are two such extensions, then $\mu_1(A) = \mu_2(A)$.

Prove that the system of sets of $\sigma$-uniqueness of a $\sigma$-additive measure $m$
defined on a semiring $\mathscr{S}_m$ coincides with the system of sets which are
Lebesgue measurable (with respect to $m$).


Hint. To show that every Lebesgue-measurable set $A$ is a set of
$\sigma$-uniqueness for $m$, choose any $\varepsilon > 0$. Then there is a set $B \in \mathscr{R} = \mathscr{R}(\mathscr{S}_m)$
such that $\mu^*(A \,\triangle\, B) < \varepsilon$. If $\tilde{\mu}$ is any $\sigma$-additive extension of $m$ defined on $A$ (and on
$\mathscr{R}$), then $\tilde{\mu}(B) = \tilde{m}(B)$, where $\tilde{m}$ is the unique extension of $m$ onto $\mathscr{R}$.
Moreover, $\tilde{\mu}(A \,\triangle\, B) \le \mu^*(A \,\triangle\, B) < \varepsilon$, and hence $|\tilde{\mu}(A) - \tilde{m}(B)| < \varepsilon$.
Therefore $|\mu_1(A) - \mu_2(A)| < 2\varepsilon$ if $\mu_1$ and $\mu_2$ are two $\sigma$-additive extensions
of $m$ defined on $A$ (and on $\mathscr{R}$). Hence $\mu_1(A) = \mu_2(A)$, by the arbitrariness
of $\varepsilon$.


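Problem 16. Let $m$ be a $\sigma$-additive measure defined on a semiring $\mathscr{S}_m$,
and let $\mathscr{L}$ be the domain of the Lebesgue extension of $m$. Let $m'$ be a
$\sigma$-additive extension of $m$ to a semiring $\mathscr{S}_{m'}$ such that

$$\mathscr{S}_m \subset \mathscr{S}_{m'} \subset \mathscr{L},$$

and let $\mathscr{L}'$ be the domain of the Lebesgue extension of $m'$. Prove that
$\mathscr{L}' = \mathscr{L}$.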


8 


INTEGRATION 


28. Measurable Functions 


28.1. Basic properties of measurable functions. Given any two sets $X$ and
$Y$, let $\mathscr{S}$ be a system of subsets of $X$ and $\mathscr{S}'$ a system of subsets of $Y$. Then
an abstract function $y = f(x)$ defined on $X$ and taking values in $Y$ is said
to be $(\mathscr{S}, \mathscr{S}')$-measurable if $A \in \mathscr{S}'$ implies $f^{-1}(A) \in \mathscr{S}$.


Example. Let $X$ and $Y$ both be the real line $R^1$, so that $y = f(x)$ is a
“function of a real variable.” Moreover, let $\mathscr{S}$ and $\mathscr{S}'$ both be the system
of all open (or closed) subsets of $R^1$. Then our definition of measurability
reduces to that of continuity (recall Sec. 9.6). On the other hand, if we
choose both $\mathscr{S}$ and $\mathscr{S}'$ to be the system $\mathscr{B}^1$ of all Borel sets on the real line
(recall p. 36), our definition becomes that of a Borel-measurable (or simply
B-measurable) function.


In what follows, we will be primarily concerned with the notion of real
functions measurable with respect to some underlying measure $\mu$, this being
the case of greatest interest from the standpoint of integration theory. More
exactly, let $X$ be any set and $Y$ the real line $R^1$, with $\mathscr{S} = \mathscr{S}_\mu$ the domain of
definition of some $\sigma$-additive measure $\mu$ and $\mathscr{S}'$ the system $\mathscr{B}^1$ of all Borel
sets $B \subset R^1$. For simplicity, we assume that $\mathscr{S}_\mu$ has a unit equal to $X$ itself.
Moreover, since any $\sigma$-additive measure can be extended onto a Borel algebra
(by Theorem 7, p. 277), we might as well assume from the outset that $\mathscr{S}_\mu$
is a Borel algebra. These considerations suggest


DEFINITION 1. Given a $\sigma$-additive measure $\mu$ defined on a Borel algebra
$\mathscr{S}_\mu$ of subsets of a set $X$, where $X$ is the unit of $\mathscr{S}_\mu$, let $y = f(x)$ be a real
function defined on $X$, and let $\mathscr{B}^1$ be the set of all Borel sets on the real
line. Then the function $f$ is said to be $\mu$-measurable (on $X$) if $f^{-1}(A) \in \mathscr{S}_\mu$
for every $A \in \mathscr{B}^1$, or equivalently if $f^{-1}(\mathscr{B}^1) \subset \mathscr{S}_\mu$.


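THEOREM 1. A function $f$ is $\mu$-measurable if and only if the set
$\{x: f(x) < c\}$ is $\mu$-measurable (i.e., belongs to $\mathscr{S}_\mu$) for every real $c$.


Proof. If $f$ is $\mu$-measurable, then obviously so is $\{x: f(x) < c\}$, since
$(-\infty, c)$ is a Borel set. Conversely, let $\Sigma$ be the system of all semi-infinite
intervals $(-\infty, c)$, and suppose $f^{-1}(\Sigma) \subset \mathscr{S}_\mu$. Since $\mathscr{B}(\Sigma)$, the Borel
closure of $\Sigma$ (see p. 36), coincides with the system $\mathscr{B}^1$ of all Borel sets
on the line (why?), we have

$$f^{-1}(\mathscr{B}^1) = f^{-1}(\mathscr{B}(\Sigma)) = \mathscr{B}(f^{-1}(\Sigma)) \subset \mathscr{B}(\mathscr{S}_\mu)$$

(recall Problem 3e, p. 36). But $\mathscr{B}(\mathscr{S}_\mu) = \mathscr{S}_\mu$, since $\mathscr{S}_\mu$ is a Borel
algebra, and hence

$$f^{-1}(\mathscr{B}^1) \subset \mathscr{S}_\mu. \;∎$$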


THEOREM 2. Let $\{f_n\}$ be a sequence of $\mu$-measurable functions on $X$,
and let $f$ be a function on $X$ such that

$$f(x) = \lim_{n \to \infty} f_n(x)$$

for every $x \in X$. Then $f$ is itself $\mu$-measurable.

Proof. First we verify that

$$\{x: f(x) < c\} = \bigcup_k \bigcup_n \bigcap_{m > n} \left\{x: f_m(x) < c - \frac{1}{k}\right\}. \tag{1}$$

In fact, if $f(x) < c$, there is an integer $k > 0$ such that

$$f(x) < c - \frac{2}{k},$$

and then for this $k$, there is an integer $n > 0$ so large that

$$f_m(x) < c - \frac{1}{k} \tag{2}$$

for all $m > n$. Therefore every $x$ belonging to the left-hand side of (1)
also belongs to the right-hand side. Conversely, if $x$ belongs to the
right-hand side of (1), there is a $k$ such that (2) holds for all sufficiently
large $m$. But then $f(x) \le c - 1/k < c$, i.e., $x$ belongs to the left-hand side of (1).
Now, since the functions $f_m$ are $\mu$-measurable, the sets

$$\left\{x: f_m(x) < c - \frac{1}{k}\right\}$$

all belong to $\mathscr{S}_\mu$, and hence so does the right-hand side of (1), since $\mathscr{S}_\mu$
is a Borel algebra. Therefore $\{x: f(x) < c\} \in \mathscr{S}_\mu$. But then $f$ is
$\mu$-measurable, by Theorem 1. ∎


THEOREM 3. A B-measurable function of a $\mu$-measurable function is
itself $\mu$-measurable.


Proof. Let $f(x) = \varphi[\psi(x)]$, where $\varphi$ is B-measurable and $\psi$ is
$\mu$-measurable. If $A \subset R^1$ is any B-measurable set, then its preimage
$A' = \varphi^{-1}(A)$ is B-measurable, and hence the preimage $A'' = \psi^{-1}(A')$ is
$\mu$-measurable. But $A'' = f^{-1}(A)$, and hence $f$ is $\mu$-measurable. ∎


COROLLARY. A continuous function of a $\mu$-measurable function is
itself $\mu$-measurable.


Proof. A continuous function is clearly B-measurable. ∎


28.2. Simple functions. Algebraic operations on measurable functions.
A function $f$ is said to be simple if it is $\mu$-measurable and takes no more
than countably many distinct values. This notion clearly depends on the
choice of the measure $\mu$.


The structure of simple functions is clarified by 


THEOREM 4. A function $f$ taking no more than countably many distinct
values $y_1, y_2, \dots$ is $\mu$-measurable if and only if the sets

$$A_n = \{x: f(x) = y_n\} \qquad (n = 1, 2, \dots)$$

are $\mu$-measurable.


Proof. Since each single-element set $\{y_n\}$ is a Borel set, the set $A_n$,
being the preimage of $\{y_n\}$, is measurable if $f$ is measurable.¹ Conversely,
suppose the sets $A_n$ are all measurable. Then the preimage $f^{-1}(B)$ of any
Borel set $B \subset R^1$ is measurable, being a union

$$\bigcup_{y_n \in B} A_n$$

of no more than countably many measurable sets $A_n$. But then $f$ is
measurable. ∎


The relation between measurable functions and simple functions is shown by 


THEOREM 5. A function $f$ is $\mu$-measurable if and only if it can be
represented as the limit of a uniformly convergent sequence of simple
functions.


¹ For simplicity, we often say “measurable” instead of “$\mu$-measurable,” omitting
explicit reference to the underlying measure $\mu$.
Proof. If $f$ is the (uniform) limit of a convergent sequence of simple
functions, then $f$ is $\mu$-measurable by Theorem 2, since simple functions
are $\mu$-measurable by definition. Conversely, given any $\mu$-measurable
function $f$, let

$$f_n(x) = \frac{m}{n} \qquad \text{if} \quad \frac{m}{n} \le f(x) < \frac{m+1}{n},$$

where $n$ is a positive integer and $m$ is an arbitrary integer. Then the functions $f_n$ are simple
and moreover converge uniformly to $f$ as $n \to \infty$, since

$$|f(x) - f_n(x)| \le \frac{1}{n}. \;∎$$
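
The approximating functions in this proof can be written as $f_n(x) = \lfloor n f(x) \rfloor / n$. The following sketch, under the assumption that $f = \sin$ is used purely as an illustrative function and that a finite grid of sample points stands in for $X$, checks the bound $|f(x) - f_n(x)| \le 1/n$ numerically; the helper name `simple_approximation` is hypothetical. It illustrates only the uniform estimate, not the measurability of $f_n$.

```python
import math

def simple_approximation(f, n):
    """Return f_n with f_n(x) = m/n whenever m/n <= f(x) < (m+1)/n."""
    def f_n(x):
        return math.floor(n * f(x)) / n
    return f_n

f = math.sin  # illustrative choice of f
sample_points = [k / 100 for k in range(-300, 301)]

for n in (1, 10, 100):
    f_n = simple_approximation(f, n)
    # Largest deviation on the sample grid; it should not exceed 1/n
    # (up to floating-point rounding).
    worst = max(abs(f(x) - f_n(x)) for x in sample_points)
    print(n, worst)
```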


The next few theorems show that the class of measurable functions is
closed under the usual algebraic operations.

THEOREM 6. If $f$ and $g$ are measurable, then so is $f + g$.


Proof. First let $f$ and $g$ be simple functions, taking values $y_1, y_2, \dots$
and $z_1, z_2, \dots$, respectively. Then the sum $h = f + g$ can only take
values of the form $c = y_i + z_j$, where each such value is taken on a set of the form

$$\{x: h(x) = c\} = \bigcup_{y_i + z_j = c} \bigl(\{x: f(x) = y_i\} \cap \{x: g(x) = z_j\}\bigr). \tag{3}$$

There are no more than countably many values $c$ of the function $h =
f + g$, and moreover each set $\{x: h(x) = c\}$ is measurable, since the
right-hand side of (3) is clearly measurable. Therefore $h = f + g$ is a
simple function.

Now let $f$ and $g$ be arbitrary measurable functions, and let $\{f_n\}$ and
$\{g_n\}$ be sequences of simple functions converging uniformly to $f$ and $g$,
respectively, as in the proof of Theorem 5. Then the sequence of simple
functions $\{f_n + g_n\}$ converges uniformly to $f + g$, and hence $f + g$ is
measurable, by Theorem 5. ∎


THEOREM 7. If $f$ is measurable, then so is $cf$, where $c$ is an arbitrary
constant.


Proof. Obviously, the product of a simple function and a constant is
again simple. But if $\{f_n\}$ is a sequence of simple functions converging
uniformly to $f$, then $\{cf_n\}$ converges uniformly to $cf$, and hence $cf$ is
measurable, by Theorem 5. ∎


THEOREM 8. If $f$ and $g$ are measurable, then so is $f - g$.

Proof. An immediate consequence of Theorems 6 and 7. ∎


THEOREM 9. If $f$ and $g$ are measurable, then so is $fg$.

Proof. Clearly,

$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right].$$

But the expression on the right is a measurable function, by Theorems
6-8 and the fact that the square of a measurable function is measurable
(this follows from the corollary to Theorem 3). ∎


THEOREM 10. If $f$ is measurable, then so is $1/f$, provided $f$ does not
vanish.


Proof. We have

$$\left\{x: \frac{1}{f(x)} < c\right\} = \left\{x: f(x) > \frac{1}{c}\right\} \cup \{x: f(x) < 0\}$$

if $c > 0$,

$$\left\{x: \frac{1}{f(x)} < c\right\} = \left\{x: \frac{1}{c} < f(x) < 0\right\}$$

if $c < 0$, and

$$\left\{x: \frac{1}{f(x)} < c\right\} = \{x: f(x) < c\}$$

if $c = 0$. But in each case the set on the right is measurable. ∎


COROLLARY. If $f$ and $g$ are measurable, then so is $f/g$, provided $g$ does
not vanish.


Proof. An immediate consequence of Theorems 9 and 10. ∎


28.3. Equivalent functions. The values of a function can often be
neglected on a set of measure zero. This suggests


DEFINITION 2. Two functions f and g defined on the same set are said 
to be equivalent (with respect to a measure $\mu$) if

$$\mu\{x: f(x) \ne g(x)\} = 0.$$


A property is said to hold almost everywhere (on E) if it holds at all points 
(of E) except possibly on a set of measure zero. Thus two functions f and g 
are said to be equivalent (written $f \sim g$) if they coincide almost everywhere.


THEOREM 11. Given two functions $f$ and $g$ continuous on an interval $E$,
suppose $f$ and $g$ are equivalent (with respect to Lebesgue measure $\mu$ on the
line). Then $f$ and $g$ coincide.


Proof. Suppose $f(x_0) \ne g(x_0)$ at some point $x_0 \in E$, so that
$f(x_0) - g(x_0) \ne 0$. Since $f - g$ is continuous, there is a neighborhood of $x_0$
(possibly one-sided) in which $f - g$ is nonzero. This neighborhood has
positive measure, and hence

$$\mu\{x: f(x) \ne g(x)\} > 0,$$

i.e., $f$ and $g$ cannot be equivalent, contrary to hypothesis. ∎


Remark. Thus two continuous functions cannot be equivalent if they
differ at even a single point. However, discontinuous functions can obviously
be equivalent without being identical. For example, the Dirichlet function

$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$

is equivalent to the function $g(x) = 0$ (recall Problem 3, p. 268).


THEOREM 12. A function f equivalent to a measurable function g is 
itself measurable. 


Proof. It follows from Definition 2 that the sets {x : f(x) < c} and
{x : g(x) < c} can differ only by a set of measure zero. Hence if the second
set is measurable, so is the first set. The proof is now an immediate
consequence of Theorem 1. ∎


28.4. Convergence almost everywhere. Since the behavior of measurable 
functions on sets of measure zero is often unimportant, it is natural to 
introduce the following generalization of the ordinary notion of convergence 
of a sequence of functions: 


DEFINITION 3. A sequence of functions $\{f_n(x)\}$ defined on a space X
is said to converge almost everywhere to a function f(x) if

$$ \lim_{n \to \infty} f_n(x) = f(x) \tag{4} $$

for almost all x ∈ X, i.e., if the set of points for which (4) fails to hold is
of measure zero.


Example. The sequence $\{f_n(x)\} = \{(-x)^n\}$ defined on [0, 1] converges
almost everywhere to the function f(x) = 0, in fact everywhere except at the
point x = 1.


Theorem 2 now has the following generalization: 


THEOREM 2′. Let $\{f_n\}$ be a sequence of μ-measurable functions on X,
and let f be a function on X such that

$$ f(x) = \lim_{n \to \infty} f_n(x) \tag{5} $$

almost everywhere on X. Then f is itself μ-measurable, provided μ is
complete.*


Proof. If A is the set on which (5) holds, then μ(X − A) = 0. The
function f is measurable on A, by Theorem 2, and also on X − A, since
every function is measurable on a set of measure zero if μ is complete
(why?). Hence f is measurable on the whole set X = A ∪ (X − A). ∎


28.5. Egorov's theorem. The following important theorem shows the
relation between the concepts of convergence almost everywhere and uniform
convergence:


THEOREM 12 (Egorov). Let $\{f_n\}$ be a sequence of measurable functions
converging almost everywhere on a measurable set E to a function f. Then,
given any δ > 0, there exists a measurable set $E_\delta \subset E$ such that


1) $\mu(E_\delta) > \mu(E) - \delta$;
2) $\{f_n\}$ converges uniformly to f on $E_\delta$.


Proof. The function f is measurable, by Theorem 2′. Let

$$ E_n^m = \bigcap_{i \ge n} \left\{x : |f_i(x) - f(x)| < \frac{1}{m}\right\}. \tag{6} $$

Thus, for fixed m and n, $E_n^m$ is the set of all points x such that

$$ |f_i(x) - f(x)| < \frac{1}{m} $$

holds for all i ≥ n. Moreover, let

$$ E^m = \bigcup_{n=1}^{\infty} E_n^m. $$

It follows from (6) that

$$ E_1^m \subset E_2^m \subset \cdots \subset E_n^m \subset \cdots, $$

and hence, by the corollary to Theorem 11, p. 267,³ given any m and
any δ > 0, there is an $n_0(m)$ such that

$$ \mu\left(E^m - E^m_{n_0(m)}\right) < \frac{\delta}{2^m}. \tag{7} $$

Let

$$ E_\delta = \bigcap_{m=1}^{\infty} E^m_{n_0(m)}. $$

Then $E_\delta$ satisfies the two conditions of the theorem. The fact that the
sequence $\{f_n\}$ is uniformly convergent on $E_\delta$ is almost obvious, since if
$x \in E_\delta$, then, given any m = 1, 2, ...,

$$ |f_i(x) - f(x)| < \frac{1}{m} $$

for every $i \ge n_0(m)$.


* See Problem 7, p. 280.
3 See also Problem 3, p. 280.

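To verify condition 1), we now estimate the measure of the set $E - E_\delta$,
noting first that $\mu(E - E^m) = 0$ for every m. In fact, if $x_0 \in E - E^m$,
then there are arbitrarily large values of i such that

$$ |f_i(x_0) - f(x_0)| \ge \frac{1}{m}, $$

which means that the sequence $\{f_n\}$ cannot converge to f at the point $x_0$.
Therefore $\mu(E - E^m) = 0$, as asserted, since $\{f_n\}$ converges to f almost
everywhere, by hypothesis. It follows from (7) that

$$ \mu\left(E - E^m_{n_0(m)}\right) = \mu\left(E^m - E^m_{n_0(m)}\right) < \frac{\delta}{2^m}. $$

Therefore

$$ \mu(E - E_\delta) = \mu\left(E - \bigcap_{m=1}^{\infty} E^m_{n_0(m)}\right) = \mu\left(\bigcup_{m=1}^{\infty} \left(E - E^m_{n_0(m)}\right)\right) \le \sum_{m=1}^{\infty} \mu\left(E - E^m_{n_0(m)}\right) < \sum_{m=1}^{\infty} \frac{\delta}{2^m} = \delta, $$

and hence $\mu(E_\delta) > \mu(E) - \delta$. ∎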
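
As an illustration of Egorov's theorem (the example is ours, not taken from the text), consider f_n(x) = x^n on [0, 1]. The sequence converges to 0 at every point of [0, 1), but not uniformly there, since the supremum of x^n over [0, 1) equals 1 for every n. On the smaller set [0, 1 − δ], whose measure is 1 − δ, the supremum is (1 − δ)^n, which tends to 0. The following minimal sketch, with hypothetical helper names, computes the first index at which this supremum drops below a given tolerance:

```python
from fractions import Fraction


def sup_on_interval(n, delta):
    """Exact supremum of x**n over [0, 1 - delta]; it is attained at x = 1 - delta."""
    return (1 - delta) ** n


def first_index_below(eps, delta):
    """Smallest n for which the supremum of x**n over [0, 1 - delta] is below eps."""
    n = 1
    while sup_on_interval(n, delta) >= eps:
        n += 1
    return n


delta = Fraction(1, 10)
n_needed = first_index_below(Fraction(1, 100), delta)
print(n_needed)
```

Because x^n is increasing in x, the same index works for every point of [0, 1 − δ] at once, which is exactly what uniform convergence on that set requires.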
Problem 1. Prove that the Dirichlet function

$$ f(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational} \end{cases} $$

is measurable on every interval [a, b].
Problem 2. Do the same for the function


e ifx = Fis rational, 
q 


fx) = \4 
0 if x is irrational. 
Problem 3. Suppose f(x) is measurable on [a, b]. Is $g(x) = e^{f(x)}$
measurable on [a, b]?
Problem 4. Prove that if f is measurable, then so is |f|.


Problem 5. Let $\{f_n\}$ be a sequence of measurable functions converging
almost everywhere to a function f. Prove that $\{f_n\}$ converges almost
everywhere to a function g if and only if f and g are equivalent.
Problem 6. A sequence $\{f_n\}$ of μ-measurable functions is said to converge
in measure to a function f if

$$ \lim_{n \to \infty} \mu\{x : |f_n(x) - f(x)| \ge \delta\} = 0 $$

for every δ > 0. Prove that if a sequence $\{f_n\}$ of measurable functions
converges to f almost everywhere, then it converges to f in measure.


Hint. Let A be the set (of measure zero) on which $\{f_n\}$ fails to converge
to f, and let

$$ E_k(\delta) = \{x : |f_k(x) - f(x)| \ge \delta\}, \qquad R_n(\delta) = \bigcup_{k=n}^{\infty} E_k(\delta), \qquad M = \bigcap_{n=1}^{\infty} R_n(\delta). \tag{8} $$

Then the sets (8) are all measurable (why?), and $\mu(R_n(\delta)) \to \mu(M)$ as
n → ∞, since $R_1(\delta) \supset R_2(\delta) \supset \cdots$. Prove that M ⊂ A and hence that
μ(M) = 0 (as always, we assume that μ is complete). It follows that
$\mu(R_n(\delta)) \to 0$ as n → ∞. Now use the fact that $E_n(\delta) \subset R_n(\delta)$.


Problem 7. Let $\{f_n\}$ be a sequence of measurable functions converging in
measure to a function f. Prove that $\{f_n\}$ converges in measure to a function
g if and only if f and g are equivalent.


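Problem 8. Given any positive integer k, consider the function

$$ f_i^{(k)}(x) = \begin{cases} 1 & \text{if } \dfrac{i-1}{k} < x \le \dfrac{i}{k}, \\ 0 & \text{otherwise}, \end{cases} $$

defined on the half-open interval (0, 1]. Show that the sequence

$$ f_1^{(1)}, f_1^{(2)}, f_2^{(2)}, \ldots, f_1^{(k)}, \ldots, f_k^{(k)}, \ldots $$

converges in measure to zero, but does not converge at any point whatsoever.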


Comment. Thus the converse of the proposition in Problem 6 is false. 
Instead we have the weaker proposition considered in the next problem. 
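
The behavior described in Problem 8 can be checked on a small scale. The sketch below assumes the functions are the indicators of the intervals ((i − 1)/k, i/k] for i = 1, ..., k (this reading of the problem, and all helper names, are assumptions). It evaluates the first few blocks of the sequence at one point and records the measure of the set where each function is nonzero:

```python
from fractions import Fraction


def indicator(k, i):
    """Indicator of the interval ((i - 1)/k, i/k], assuming 1 <= i <= k."""
    lo, hi = Fraction(i - 1, k), Fraction(i, k)
    return lambda x: 1 if lo < x <= hi else 0


def blocks(max_k, x):
    """Values at x of the k-th block f_1^(k), ..., f_k^(k), for k = 1, ..., max_k."""
    return {k: [indicator(k, i)(x) for i in range(1, k + 1)]
            for k in range(1, max_k + 1)}


x = Fraction(1, 3)
values = blocks(6, x)

# Measure of the set where a function of the k-th block is nonzero: exactly 1/k.
support_measure = {k: Fraction(1, k) for k in values}

for k, block in values.items():
    print(k, block, support_measure[k])
```

Every block contains the value 1 exactly once, because the k intervals partition (0, 1], and for k ≥ 2 it also contains zeros. So the values at a fixed point keep returning to both 0 and 1, while the measure 1/k of the exceptional set tends to zero.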


Problem 9. Prove that if a sequence $\{f_n\}$ of functions converges to f in
measure, then it contains a subsequence $\{f_{n_k}\}$ converging to f almost
everywhere.


Hint. Let $\{\delta_n\}$ be a sequence of positive numbers such that

$$ \lim_{n \to \infty} \delta_n = 0, $$

and let $\{\varepsilon_n\}$ be a sequence of positive numbers such that

$$ \sum_{n=1}^{\infty} \varepsilon_n < \infty. $$

Let $\{n_k\}$ be a sequence of positive integers such that $n_k > n_{k-1}$ and

$$ \mu\{x : |f_{n_k}(x) - f(x)| \ge \delta_k\} < \varepsilon_k \qquad (k = 1, 2, \ldots). $$

Moreover, let

$$ R_i = \bigcup_{k=i}^{\infty} \{x : |f_{n_k}(x) - f(x)| \ge \delta_k\}, \qquad Q = \bigcap_{i=1}^{\infty} R_i. $$

Then $\mu(R_i) \to \mu(Q)$ as i → ∞, since $R_1 \supset R_2 \supset \cdots$. On the other hand,

$$ \mu(R_i) < \sum_{k=i}^{\infty} \varepsilon_k, $$

and hence $\mu(R_i) \to 0$, so that μ(Q) = 0. Now show that $\{f_{n_k}\}$ converges to
f on E − Q.

Problem 10. Prove that a function f defined on a closed interval [a, b] is
μ-measurable if and only if, given any ε > 0, there is a continuous function
φ on [a, b] such that μ{x : f(x) ≠ φ(x)} < ε.


Hint. Use Egorov’s theorem. 


Comment. This result, known as Luzin’s theorem, shows that a measurable 
function “can be made continuous by altering it on a set of arbitrarily small 
measure.” 


29. The Lebesgue Integral 


The concept of the Riemann integral, familiar from calculus, applies
only to functions which are either continuous or else do not have "too many"
points of discontinuity. Hence we cannot form the Riemann integral of a
general measurable function f. In fact, f may be discontinuous everywhere,
or it may even be meaningless to talk about the continuity of f in the case
where f is defined on an abstract set. For such functions, there is another
fully developed notion of the integral, due to Lebesgue, which is more
flexible than the notion of the Riemann integral.

Let f be a function defined on a closed interval [a, b] of the x-axis.
Then to form the Riemann integral of f, we divide [a, b] into many
subintervals, thereby grouping together neighboring points of the x-axis. On
the other hand, as we will see below, the Lebesgue integral is formed by
grouping together points of the x-axis at which the function f takes
neighboring values. In other words, the key idea of the theory of Lebesgue
integration is to partition the range of the function f rather than its domain.
This immediately makes it possible to extend the notion of integral to a very
large class of functions.
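
The contrast between partitioning the domain and partitioning the range can be made concrete numerically. The sketch below is only an illustration under stated assumptions: the measure of each level set is estimated by counting points of a fine sample grid, which is a crude stand-in for Lebesgue measure, and the function names are hypothetical.

```python
def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum: group neighboring points of the x-axis."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


def lebesgue_style_sum(f, a, b, levels, samples=20000):
    """Group points by the value of f: sum of (level) * (estimated measure of level set)."""
    h = (b - a) / samples
    ys = [f(a + (i + 0.5) * h) for i in range(samples)]
    lo, hi = min(ys), max(ys)
    step = (hi - lo) / levels
    if step == 0:                       # constant function: a single level
        return lo * (b - a)
    counts = [0] * levels               # counts[j] estimates the measure of
    for y in ys:                        # {x : lo + j*step <= f(x) < lo + (j+1)*step}
        j = min(int((y - lo) / step), levels - 1)
        counts[j] += 1
    return sum((lo + j * step) * counts[j] * h for j in range(levels))


def square(x):
    return x * x


riemann_value = riemann_sum(square, 0.0, 1.0, 1000)
lebesgue_value = lebesgue_style_sum(square, 0.0, 1.0, 200)
print(riemann_value, lebesgue_value)
```

Both values approximate the integral of x² over [0, 1]. The second sum uses the lower end of each level, so it lies slightly below the first, with an error controlled by the spacing of the levels.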

Another advantage of the Lebesgue integral is that it is constructed in
exactly the same way for functions defined on an abstract "measure space"
(an arbitrary set X equipped with a measure) as for functions defined on the
real line. This is to be contrasted with the situation for the Riemann integral,
which is first introduced for functions of a single real variable and then
extended, with suitable modifications, to the case of functions of several
real variables, but fails to make any sense at all for functions defined on an
abstract measure space.

In what follows, unless the contrary is explicitly stated, we will consider
a σ-additive measure μ defined on a Borel algebra of subsets of a set X,
with X as the unit. We will assume that all sets under consideration are
μ-measurable, and that all functions under consideration are defined and
μ-measurable on X.


29.1. Definition and basic properties of the Lebesgue integral. Let f be a
simple function, i.e., a μ-measurable function taking no more than countably
many distinct values

$$ y_1, y_2, \ldots, y_n, \ldots. \tag{1} $$

Then by the (Lebesgue) integral of f over the set A, denoted by

$$ \int_A f(x)\,d\mu, $$

we mean the quantity

$$ \sum_n y_n \mu(A_n), \tag{2} $$

where

$$ A_n = \{x : x \in A,\ f(x) = y_n\}, $$

provided the series (2) is absolutely convergent. If the Lebesgue integral
of f exists, we say that f is integrable or summable (with respect to the measure
μ) on the set A.


Example. Obviously,

$$ \int_A 1 \cdot d\mu = \int_A d\mu = \mu(A). $$

We now get rid of the restriction that the numbers (1) be distinct:


LEMMA. Given a simple function f defined on a set A, suppose A is a
union

$$ A = \bigcup_k B_k $$

of pairwise disjoint sets $B_k$, such that f takes only one value $c_k$ on $B_k$. Then
f is integrable on A if and only if the series

$$ \sum_k c_k \mu(B_k) \tag{3} $$

is absolutely convergent, in which case

$$ \int_A f(x)\,d\mu = \sum_k c_k \mu(B_k). $$

Proof. Each set

$$ A_n = \{x : x \in A,\ f(x) = y_n\} $$

is the union of the sets $B_k$ for which $c_k = y_n$. Therefore*

$$ \sum_n y_n \mu(A_n) = \sum_n y_n \sum_{c_k = y_n} \mu(B_k) = \sum_k c_k \mu(B_k). $$

Moreover, since μ is nonnegative, we have

$$ \sum_n |y_n| \mu(A_n) = \sum_n |y_n| \sum_{c_k = y_n} \mu(B_k) = \sum_k |c_k| \mu(B_k), $$

so that the series (2) is absolutely convergent if and only if the series (3)
is absolutely convergent. ∎
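
For a simple function with finitely many values, the definition (2) and the lemma reduce to finite sums, which can be checked with exact arithmetic. In this minimal sketch a simple function is represented only by a list of (value, measure) pairs; the representation and the function name are assumptions made for illustration.

```python
from fractions import Fraction


def integral_simple(pieces):
    """Sum of value * measure over a finite list of (value, measure) pairs."""
    return sum(value * measure for value, measure in pieces)


# f takes the distinct values 2, -1, 0 on sets of measure 1/2, 1/3, 1/6.
distinct = [(2, Fraction(1, 2)), (-1, Fraction(1, 3)), (0, Fraction(1, 6))]

# The same function described on a finer partition, with the value 2 repeated.
refined = [(2, Fraction(1, 4)), (2, Fraction(1, 4)),
           (-1, Fraction(1, 3)), (0, Fraction(1, 6))]

total_distinct = integral_simple(distinct)
total_refined = integral_simple(refined)

# The constant function 1 on a set of measure 3/4 integrates to that measure.
total_constant = integral_simple([(1, Fraction(3, 4))])

print(total_distinct, total_refined, total_constant)
```

The two descriptions give the same sum, as the lemma asserts: grouping the pieces on which the function takes the same value does not change the total.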


THEOREM 1. Let f and g be simple functions integrable on a set A, and
let k be any constant. Then f + g and kf are integrable over A, and

$$ \int_A [f(x) + g(x)]\,d\mu = \int_A f(x)\,d\mu + \int_A g(x)\,d\mu, \tag{4} $$

$$ \int_A k f(x)\,d\mu = k \int_A f(x)\,d\mu. \tag{5} $$

Proof. Suppose f takes distinct values $y_i$ on sets $F_i \subset A$, while g
takes distinct values $z_j$ on sets $G_j \subset A$, where i, j = 1, 2, .... Then

$$ \int_A f(x)\,d\mu = \sum_i y_i \mu(F_i), \tag{6} $$

$$ \int_A g(x)\,d\mu = \sum_j z_j \mu(G_j). \tag{7} $$

Clearly, f + g takes the values $c_{ij} = y_i + z_j$ (not necessarily distinct)
on the pairwise disjoint sets $B_{ij} = F_i \cap G_j$. It follows from

$$ \mu(F_i) = \sum_j \mu(F_i \cap G_j), \qquad \mu(G_j) = \sum_i \mu(F_i \cap G_j) $$

and the absolute convergence of the series (6) and (7) that the series

$$ \sum_i \sum_j c_{ij} \mu(B_{ij}) = \sum_i \sum_j (y_i + z_j) \mu(F_i \cap G_j) $$

is absolutely convergent. Hence, by the lemma, f + g is integrable on
A and

$$ \int_A [f(x) + g(x)]\,d\mu = \sum_i \sum_j (y_i + z_j) \mu(F_i \cap G_j) = \sum_i y_i \mu(F_i) + \sum_j z_j \mu(G_j). \tag{8} $$

Comparing (6)-(8), we get (4). The proof of (5) is trivial. ∎


* The notation $\sum_{c_k = y_n}$ calls for the sum over all k such that $c_k = y_n$.


THEOREM 2. Let f be a bounded simple function on A, where |f(x)| ≤ M
if x ∈ A. Then f is integrable on A and

$$ \left| \int_A f(x)\,d\mu \right| \le M \mu(A). $$

Proof. If f takes values $y_n$ on sets $A_n \subset A$ (n = 1, 2, ...), then

$$ \left| \int_A f(x)\,d\mu \right| = \left| \sum_n y_n \mu(A_n) \right| \le \sum_n |y_n| \mu(A_n) \le M \sum_n \mu(A_n) = M \mu(A), $$

where we have incidentally proved the integrability of f on A (how?). ∎
Next we remove the restriction that f be a simple function: 


DEFINITION. A measurable function f is said to be integrable (or
summable) on a set A if there exists a sequence $\{f_n\}$ of integrable simple
functions converging uniformly to f on A. The limit

$$ \lim_{n \to \infty} \int_A f_n(x)\,d\mu \tag{9} $$

is then called the (Lebesgue) integral of f over the set A, denoted by

$$ \int_A f(x)\,d\mu. $$

This definition relies tacitly on the following conditions being met:


1) The limit (9) exists (and is finite) for any uniformly convergent sequence
of integrable simple functions on A;
2) For any given f, this limit is independent of the choice of the sequence
$\{f_n\}$;
3) For simple functions, the definitions of integrability and of the integral
reduce to those given on p. 294.
All these conditions are indeed satisfied. Condition 1) is an immediate
consequence of the estimate

$$ \left| \int_A f_m(x)\,d\mu - \int_A f_n(x)\,d\mu \right| = \left| \int_A [f_m(x) - f_n(x)]\,d\mu \right| \le \mu(A) \sup_{x \in A} |f_m(x) - f_n(x)|, $$

implied by Theorems 1 and 2. To prove 2), suppose the sequences $\{f_n\}$ and
$\{f_n^*\}$ both converge uniformly to f, but

$$ \lim_{n \to \infty} \int_A f_n(x)\,d\mu \neq \lim_{n \to \infty} \int_A f_n^*(x)\,d\mu. $$

Let $\{\varphi_n\}$ be the sequence

$$ f_1, f_1^*, f_2, f_2^*, \ldots, f_n, f_n^*, \ldots. $$

Then $\{\varphi_n\}$ converges uniformly to f, but

$$ \lim_{n \to \infty} \int_A \varphi_n(x)\,d\mu $$

fails to exist, contrary to condition 1). Finally, to prove 3), if f is simple,
we need only consider the trivial sequence $\{f_n\}$ with general term $f_n = f$.

THEOREM 1’. Theorem 1 continues to hold if f and g are arbitrary 
measurable functions integrable on A. 


Proof. An immediate consequence of Theorem 1, after taking suitable
uniform limits of integrable simple functions. ∎


THEOREM 3. If φ is nonnegative and integrable on A and if |f(x)| ≤
φ(x) almost everywhere on A, then f is also integrable on A and

$$ \left| \int_A f(x)\,d\mu \right| \le \int_A \varphi(x)\,d\mu. \tag{10} $$

Proof. If f and φ are simple functions, then, by subtracting a set of
measure zero from A, we get a set A′ which can be represented as a
finite or countable union

$$ A' = \bigcup_n A_n $$

of subsets $A_n \subset A'$ such that

$$ f(x) = a_n, \qquad \varphi(x) = b_n $$

for all $x \in A_n$ and

$$ |a_n| \le b_n \qquad (n = 1, 2, \ldots). $$

Since φ is integrable on A, we have

$$ \sum_n |a_n| \mu(A_n) \le \sum_n b_n \mu(A_n) = \int_{A'} \varphi(x)\,d\mu = \int_A \varphi(x)\,d\mu \tag{11} $$

(see Problem 3b). Therefore f is also integrable on A, and

$$ \left| \int_A f(x)\,d\mu \right| = \left| \int_{A'} f(x)\,d\mu \right| = \left| \sum_n a_n \mu(A_n) \right| \le \sum_n |a_n| \mu(A_n). \tag{12} $$

Comparing (11) and (12), we get (10).

In the case where f and φ are arbitrary measurable functions, let
$\{f_n\}$ and $\{\varphi_n\}$ be sequences of simple functions converging uniformly to f
and φ, respectively, constructed in the same way as in the proof of
Theorem 5, p. 286. Then clearly

$$ |f_n(x)| \le \varphi_n(x) \qquad (n = 1, 2, \ldots) $$

on A′. Moreover each $\varphi_n$ is integrable, since φ is integrable by
hypothesis. It follows that each $f_n$ and hence f itself is integrable, where

$$ \left| \int_A f_n(x)\,d\mu \right| \le \int_A \varphi_n(x)\,d\mu. $$

Taking the limit as n → ∞, we again get (10). ∎


COROLLARY. If f is bounded and measurable on A, then f is integrable
on A.


Proof. Choose φ(x) = M, where

$$ M = \sup_{x \in A} |f(x)|. $$


29.2. Some key theorems. We now prove some important properties of
the Lebesgue integral, regarded as a set function

$$ F(A) = \int_A f(x)\,d\mu \tag{13} $$

defined on a system of measurable sets (with the integrand f held fixed).
THEOREM 4. Let

$$ A = \bigcup_n A_n $$

be a finite or countable union of pairwise disjoint sets $A_n$, and suppose f is
integrable on A. Then f is integrable on each $A_n$, and

$$ \int_A f(x)\,d\mu = \sum_n \int_{A_n} f(x)\,d\mu, \tag{14} $$

where the series on the right is absolutely convergent.


Proof. First let f be a simple function, taking the values $y_1, y_2, \ldots$,
and let

$$ B_k = \{x : x \in A,\ f(x) = y_k\}, \qquad B_{nk} = \{x : x \in A_n,\ f(x) = y_k\}. $$

Then

$$ \int_A f(x)\,d\mu = \sum_k y_k \mu(B_k) = \sum_k y_k \sum_n \mu(B_{nk}) = \sum_n \sum_k y_k \mu(B_{nk}) = \sum_n \int_{A_n} f(x)\,d\mu. \tag{15} $$

Since f is integrable on A, the series $\sum_k y_k \mu(B_k)$ converges absolutely, and
hence so do the other series in (15). (Here we use the nonnegativity of
the measure μ.) In particular, f is integrable on each set $A_n$.
Next let f be an arbitrary measurable function integrable on A. Then,
given any ε > 0, there is a simple function g integrable on A such that

$$ |f(x) - g(x)| < \varepsilon \qquad (x \in A). \tag{16} $$

For g we have

$$ \int_A g(x)\,d\mu = \sum_n \int_{A_n} g(x)\,d\mu, \tag{17} $$

as just shown, where g is integrable on each $A_n$ and the series converges
absolutely. Hence, by (16), f is also integrable on each $A_n$ and

$$ \sum_n \left| \int_{A_n} f(x)\,d\mu - \int_{A_n} g(x)\,d\mu \right| \le \sum_n \varepsilon \mu(A_n) = \varepsilon \mu(A), $$

$$ \left| \int_A f(x)\,d\mu - \int_A g(x)\,d\mu \right| \le \varepsilon \mu(A), $$

which, together with (17), implies the absolute convergence of the series

$$ \sum_n \int_{A_n} f(x)\,d\mu $$

and the estimate

$$ \left| \sum_n \int_{A_n} f(x)\,d\mu - \int_A f(x)\,d\mu \right| \le 2 \varepsilon \mu(A). \tag{18} $$

But (18) implies (14), since ε > 0 is arbitrary. ∎


COROLLARY. If f is integrable on A, then f is integrable on every
measurable subset A′ ⊂ A.


Proof. Think of A as the union of the disjoint sets A′ and A − A′. ∎


Remark. A succinct way of expressing the property (14) is to say that
the set function (13) is σ-additive.


THEOREM 5 (Chebyshev's inequality). If f is nonnegative and integrable
on A, then

$$ \mu\{x : x \in A,\ f(x) \ge c\} \le \frac{1}{c} \int_A f(x)\,d\mu. $$

Proof. If

$$ A' = \{x : x \in A,\ f(x) \ge c\}, $$

then

$$ \int_A f(x)\,d\mu = \int_{A'} f(x)\,d\mu + \int_{A - A'} f(x)\,d\mu \ge \int_{A'} f(x)\,d\mu \ge c \mu(A') $$

(see Problem 4a). ∎


COROLLARY. If

$$ \int_A |f(x)|\,d\mu = 0, $$

then f(x) = 0 almost everywhere.
Proof. By Chebyshev's inequality,

$$ \mu\left\{x : x \in A,\ |f(x)| \ge \frac{1}{n}\right\} \le n \int_A |f(x)|\,d\mu = 0 $$

for all n = 1, 2, .... Therefore

$$ \mu\{x : x \in A,\ f(x) \neq 0\} \le \sum_{n=1}^{\infty} \mu\left\{x : x \in A,\ |f(x)| \ge \frac{1}{n}\right\} = 0. \ ∎ $$
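
Chebyshev's inequality can be verified directly on a small discrete measure space, where integrals are finite weighted sums. The space, the weights, and the helper names below are assumptions chosen for illustration only.

```python
from fractions import Fraction


def integral(f, weights):
    """Integral of f with respect to the discrete measure given by the point weights."""
    return sum(f[x] * w for x, w in weights.items())


def measure_where(predicate, f, weights):
    """Measure of the set of points x at which predicate(f(x)) holds."""
    return sum(w for x, w in weights.items() if predicate(f[x]))


weights = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
f = {"a": 0, "b": 3, "c": 8}          # a nonnegative function on the three points
c = 3

lhs = measure_where(lambda value: value >= c, f, weights)   # measure of {f >= c}
rhs = integral(f, weights) / c                               # (1/c) * integral of f

print(lhs, rhs, lhs <= rhs)
```

Here the set where f ≥ 3 consists of the points b and c, of total measure 1/2, while the integral of f equals 11/4, so the right-hand side is 11/12.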


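THEOREM 6. If f is integrable on a set A, then, given any ε > 0, there
is a δ > 0 such that

$$ \left| \int_E f(x)\,d\mu \right| < \varepsilon $$

for every measurable set E ⊂ A of measure less than δ.
Proof. The proof is immediate if f is bounded, since then

$$ \left| \int_E f(x)\,d\mu \right| \le \int_E |f(x)|\,d\mu \le \sup_{x \in E} |f(x)| \, \mu(E) $$

(see Problem 4c). In the general case, let

$$ A_n = \{x : x \in A,\ n \le |f(x)| < n + 1\}, $$

$$ B_N = \bigcup_{n=0}^{N} A_n, $$

$$ C_N = A - B_N. $$

Then, by Theorem 4,

$$ \int_A |f(x)|\,d\mu = \sum_{n=0}^{\infty} \int_{A_n} |f(x)|\,d\mu. $$

Let N be such that

$$ \sum_{n=N+1}^{\infty} \int_{A_n} |f(x)|\,d\mu = \int_{C_N} |f(x)|\,d\mu < \frac{\varepsilon}{2}, $$

and let

$$ 0 < \delta < \frac{\varepsilon}{2(N + 1)}. $$

Then μ(E) < δ implies

$$ \left| \int_E f(x)\,d\mu \right| \le \int_E |f(x)|\,d\mu = \int_{E \cap B_N} |f(x)|\,d\mu + \int_{E \cap C_N} |f(x)|\,d\mu \le (N + 1)\mu(E) + \int_{C_N} |f(x)|\,d\mu < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \ ∎ $$

Remark. The property figuring in Theorem 6 is expressed by saying that
the set function (13) is absolutely continuous with respect to the measure μ.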
Problem 1. Prove that the Dirichlet function

$$ f(x) = \begin{cases} 1 & \text{if } x \text{ is rational}, \\ 0 & \text{if } x \text{ is irrational} \end{cases} $$

fails to have a Riemann integral over any interval [a, b]. Prove that the
Lebesgue integral of f over any measurable set A exists and equals zero.


Problem 2. Find the Lebesgue integral of the function 


z ifx =? is tational, 
fx) = \4 q 

1 if x is irrational 
over the interval [a, b].


Problem 3. Prove that
a) If f is integrable on a set Z of measure zero, then

$$ \int_Z f(x)\,d\mu = 0; $$

b) If f is integrable on A, then

$$ \int_{A'} f(x)\,d\mu = \int_A f(x)\,d\mu $$

for every subset A′ ⊂ A such that μ(A − A′) = 0.
Comment. We can regard a) as a limiting case of Theorem 6.


Problem 4. Prove that


a) If f is nonnegative and integrable on A, then

$$ \int_A f(x)\,d\mu \ge 0; $$

b) If f and g are integrable on A and f(x) ≤ g(x) almost everywhere, then

$$ \int_A f(x)\,d\mu \le \int_A g(x)\,d\mu; $$

c) If f is integrable on A and m ≤ f(x) ≤ M almost everywhere, then

$$ m \mu(A) \le \int_A f(x)\,d\mu \le M \mu(A). $$

Problem 5. Prove that the existence of either of the integrals

$$ \int_A f(x)\,d\mu, \qquad \int_A |f(x)|\,d\mu $$

implies the existence of the other.


Problem 6. Let

$$ A = \bigcup_n A_n $$

be a finite or countable union of pairwise disjoint sets $A_n$, and suppose f
is integrable on each $A_n$ and satisfies the condition

$$ \sum_n \int_{A_n} |f(x)|\,d\mu < \infty. \tag{19} $$

Prove that f is integrable on A.


Hint. If f is simple, with values $y_1, y_2, \ldots$, let the sets $B_k$ and $B_{nk}$ be
the same as in the proof of Theorem 4. Then

$$ \int_{A_n} |f(x)|\,d\mu = \sum_k |y_k| \mu(B_{nk}). $$

The convergence of (19) implies the convergence of

$$ \sum_n \sum_k |y_k| \mu(B_{nk}) = \sum_k |y_k| \sum_n \mu(B_{nk}) = \sum_k |y_k| \mu(B_k), $$

and hence the integrability of f on A. In the general case, let g be a simple
function approximating f, and show that (19) implies the convergence of

$$ \sum_n \int_{A_n} |g(x)|\,d\mu, $$

so that g, and hence f, is integrable on A.
Comment. This is essentially the converse of Theorem 4.


Problem 7. Let μ be a σ-additive measure defined on a Borel algebra $\mathscr{S}_\mu$
of subsets of a given set X, and let f be nonnegative and integrable on X
(with respect to μ). Prove that the set function

$$ F(A) = \int_A f(x)\,d\mu $$

is itself a σ-additive measure on $\mathscr{S}_\mu$, with the property that F(A) = 0
whenever μ(A) = 0.


Problem 8. Suppose f is integrable on sets $A_1, A_2, \ldots, A_n, \ldots$ such that

$$ A_1 \supset A_2 \supset \cdots \supset A_n \supset \cdots, $$

and let

$$ A = \bigcap_n A_n. $$

Does the sequence

$$ \int_{A_n} f(x)\,d\mu $$

converge to

$$ \int_A f(x)\,d\mu\,? $$


30. Further Properties of the Lebesgue Integral 


30.1. Passage to the limit in Lebesgue integrals. The problem of taking 
limits behind the integral sign, or equivalently of integrating a convergent 
series term by term, is often encountered in analysis. In the classical theory 
of integration, it is proved that a sufficient condition for taking such a limit 
is that the series (or sequence) in question be uniformly convergent. We 
now examine the corresponding theorems for Lebesgue integrals, which 
constitute a rather far-reaching generalization of their classical counterparts. 


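THEOREM 1 (Lebesgue's bounded convergence theorem). Let $\{f_n\}$ be a
sequence of functions converging to a limit f on A, and suppose

$$ |f_n(x)| \le \varphi(x) \qquad (x \in A,\ n = 1, 2, \ldots), $$

where φ is integrable on A. Then f is integrable on A and

$$ \lim_{n \to \infty} \int_A f_n(x)\,d\mu = \int_A f(x)\,d\mu. $$

Proof. Clearly |f(x)| ≤ φ(x), and hence f is integrable, by Theorem 3,
p. 297. Let

$$ A_k = \{x : k - 1 \le \varphi(x) < k\}, $$

$$ B_m = \bigcup_{k > m} A_k = \{x : \varphi(x) \ge m\}. $$

By Theorem 4, p. 298,

$$ \int_A \varphi(x)\,d\mu = \sum_k \int_{A_k} \varphi(x)\,d\mu, \tag{1} $$

where the series on the right is absolutely convergent. By the same token,

$$ \int_{B_m} \varphi(x)\,d\mu = \sum_{k > m} \int_{A_k} \varphi(x)\,d\mu. $$

Given any ε > 0, there is an integer m such that

$$ \int_{B_m} \varphi(x)\,d\mu < \frac{\varepsilon}{5}, $$

since the series (1) converges. Moreover, φ(x) < m on $A - B_m$. By
Egorov's theorem (Theorem 12, p. 290), $A - B_m$ can be represented in
the form

$$ A - B_m = C \cup D, $$

where $\{f_n\}$ converges uniformly to f on C and

$$ \mu(D) < \frac{\varepsilon}{5m}. $$

Let N be such that

$$ |f_n(x) - f(x)| < \frac{\varepsilon}{5\mu(C)} $$

on C if n > N. Then

$$ \int_A [f_n(x) - f(x)]\,d\mu = \int_{B_m} f_n(x)\,d\mu - \int_{B_m} f(x)\,d\mu + \int_D f_n(x)\,d\mu - \int_D f(x)\,d\mu + \int_C [f_n(x) - f(x)]\,d\mu, $$

and hence

$$ \left| \int_A f_n(x)\,d\mu - \int_A f(x)\,d\mu \right| = \left| \int_A [f_n(x) - f(x)]\,d\mu \right| \le \int_{B_m} |f_n(x)|\,d\mu + \int_{B_m} |f(x)|\,d\mu + \int_D |f_n(x)|\,d\mu + \int_D |f(x)|\,d\mu + \int_C |f_n(x) - f(x)|\,d\mu < \frac{\varepsilon}{5} + \frac{\varepsilon}{5} + \frac{\varepsilon}{5} + \frac{\varepsilon}{5} + \frac{\varepsilon}{5} = \varepsilon, $$

which implies the limit relation asserted in the theorem, since ε > 0 is
arbitrary. ∎


COROLLARY. If $|f_n(x)| \le M$ and $f_n \to f$, then

$$ \lim_{n \to \infty} \int_A f_n(x)\,d\mu = \int_A f(x)\,d\mu. $$

Proof. Choose φ(x) = M, noting that every constant is integrable
on A. ∎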
Remark. The values taken by a function on a set of measure zero have
no effect on its integral. Hence in Theorem 1 we need only assume that $\{f_n\}$
converges to f almost everywhere and that the inequality $|f_n(x)| \le \varphi(x)$
holds almost everywhere.


THEOREM 2 (Levi). Suppose

$$ f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots $$

on a set A, where the functions $f_n$ are all integrable and

$$ \int_A f_n(x)\,d\mu \le M \qquad (n = 1, 2, \ldots) \tag{2} $$

for some constant M. Then the limit

$$ f(x) = \lim_{n \to \infty} f_n(x) $$

exists (and is finite) almost everywhere on A.⁵ Moreover, f is integrable
and

$$ \lim_{n \to \infty} \int_A f_n(x)\,d\mu = \int_A f(x)\,d\mu. $$

Proof. It can be assumed that $f_1(x) \ge 0$, since otherwise we need
only replace the $f_n$ by $f_n - f_1$. Let

$$ \Omega = \{x : x \in A,\ f_n(x) \to \infty\}. $$

Then clearly

$$ \Omega = \bigcap_r \bigcup_n \Omega_n^{(r)}, $$

where

$$ \Omega_n^{(r)} = \{x : x \in A,\ f_n(x) > r\}. $$

It follows from (2) and Chebyshev's inequality (Theorem 5, p. 299) that

$$ \mu(\Omega_n^{(r)}) \le \frac{M}{r}. $$

Moreover

$$ \mu\left(\bigcup_n \Omega_n^{(r)}\right) \le \frac{M}{r}, $$

since

$$ \Omega_1^{(r)} \subset \Omega_2^{(r)} \subset \cdots \subset \Omega_n^{(r)} \subset \cdots. $$

But

$$ \Omega \subset \bigcup_n \Omega_n^{(r)} $$

for any r, and hence

$$ \mu(\Omega) \le \frac{M}{r}. $$

Since r can be arbitrarily large, this implies

$$ \mu(\Omega) = 0, $$

thereby showing that the sequence $\{f_n(x)\}$ has a finite limit f(x) for
almost all x ∈ A.

5 The function f can be defined in an arbitrary way on the set E where the limit
fails to exist, for example, by setting f(x) = 0 on E.

Now let

$$ A_r = \{x : r - 1 \le f(x) < r\}, $$

and let φ be the simple function such that

$$ \varphi(x) = r \quad \text{if } x \in A_r \qquad (r = 1, 2, \ldots). $$

Moreover, let

$$ B_s = \bigcup_{r=1}^{s} A_r. $$

Since the functions $f_n$ and f are bounded on $B_s$ and since

$$ \varphi(x) \le f(x) + 1, $$

we have

$$ \int_{B_s} \varphi(x)\,d\mu \le \int_{B_s} f(x)\,d\mu + \mu(A) = \lim_{n \to \infty} \int_{B_s} f_n(x)\,d\mu + \mu(A) \le M + \mu(A), $$

where we use the corollary to Theorem 1. But

$$ \int_{B_s} \varphi(x)\,d\mu = \sum_{r=1}^{s} r \mu(A_r), $$

and hence

$$ \sum_{r=1}^{s} r \mu(A_r) \le M + \mu(A) $$

for all s = 1, 2, .... Therefore

$$ \sum_{r=1}^{\infty} r \mu(A_r) < \infty, $$

i.e., φ is integrable on A, with integral

$$ \int_A \varphi(x)\,d\mu = \sum_{r=1}^{\infty} r \mu(A_r). $$

Since $f_n(x) \le \varphi(x)$, the asserted limit relation is now an immediate
consequence of Lebesgue's bounded convergence theorem (Theorem 1). ∎
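COROLLARY. If $\psi_k(x) \ge 0$ and

$$ \sum_{k=1}^{\infty} \int_A \psi_k(x)\,d\mu < \infty, $$

then the series

$$ \sum_{k=1}^{\infty} \psi_k(x) $$

converges almost everywhere on A and

$$ \sum_{k=1}^{\infty} \int_A \psi_k(x)\,d\mu = \int_A \left( \sum_{k=1}^{\infty} \psi_k(x) \right) d\mu. $$

Proof. Apply Theorem 2 to the functions

$$ f_n(x) = \sum_{k=1}^{n} \psi_k(x). $$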


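THEOREM 3 (Fatou). Let $\{f_n\}$ be a sequence of nonnegative functions
integrable on a set A, such that

$$ \int_A f_n(x)\,d\mu \le M \qquad (n = 1, 2, \ldots). $$

Suppose $\{f_n\}$ converges almost everywhere on A to a function f. Then f is
integrable on A and

$$ \int_A f(x)\,d\mu \le M. $$

Proof. Let

$$ \varphi_n(x) = \inf_{k \ge n} f_k(x). $$

Then $\varphi_n$ is measurable, since

$$ \{x : \varphi_n(x) < c\} = \bigcup_{k \ge n} \{x : f_k(x) < c\}. $$

Moreover

$$ 0 \le \varphi_n(x) \le f_n(x), $$

and hence $\varphi_n$ is integrable, by Theorem 3, p. 297, with

$$ \int_A \varphi_n(x)\,d\mu \le \int_A f_n(x)\,d\mu \le M \qquad (n = 1, 2, \ldots). $$

Clearly

$$ \varphi_1(x) \le \varphi_2(x) \le \cdots \le \varphi_n(x) \le \cdots, $$

and

$$ \lim_{n \to \infty} \varphi_n(x) = f(x) $$

almost everywhere. Applying Theorem 2 to the sequence $\{\varphi_n\}$, we find
that f is integrable and

$$ \int_A f(x)\,d\mu = \lim_{n \to \infty} \int_A \varphi_n(x)\,d\mu \le M. \ ∎ $$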
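
A standard example, not taken from the text, shows that the inequality in Fatou's theorem can be strict and that the limit cannot always be taken behind the integral sign. On [0, 1] let f_n equal n on the open interval (0, 1/n) and 0 elsewhere. Each f_n is a simple function with integral n · (1/n) = 1, while f_n(x) → 0 at every point. The sketch below, with hypothetical helper names, checks both facts with exact arithmetic:

```python
from fractions import Fraction


def f(n, x):
    """f_n(x): equal to n on the open interval (0, 1/n), and 0 elsewhere on [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0


def integral_f(n):
    """Integral of the simple function f_n: the value n times the measure 1/n."""
    return n * Fraction(1, n)


x = Fraction(1, 10)
values_at_x = [f(n, x) for n in range(1, 30)]       # f_1(x), ..., f_29(x)
integrals = [integral_f(n) for n in range(1, 30)]

print(values_at_x)
print(integrals[:5])
```

At x = 1/10 the values vanish as soon as n ≥ 10, so the limit function is 0 and has integral 0, which is at most M = 1 as Fatou's theorem requires. The integrals themselves stay equal to 1, so the conclusion of Theorem 1 fails here; this is consistent with that theorem, since no integrable function dominates all the f_n.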
30.2. The Lebesgue integral over a set of infinite measure. So far all our
measures have been finite (except for Remark 3, p. 267), and hence everything
said about the Lebesgue integral and its properties has been tacitly understood
to apply only to the case of functions defined on sets of finite measure.
However, one often deals with functions defined on a set X of infinite measure,
for example, the real line equipped with ordinary Lebesgue measure. We
will confine ourselves to the case of greatest practical interest, where X can
be represented as a union

$$ X = \bigcup_n X_n, \qquad \mu(X_n) < \infty \tag{3} $$

of countably many sets $X_n$, each of finite measure with respect to some
σ-additive measure μ defined on a σ-ring of subsets of X (the sets of finite
measure). Such a measure is called σ-finite. For example, Lebesgue measure
on the line, in the plane, or more generally in n-space is σ-finite. For
simplicity, and without loss of generality (why?), we will assume that the
sequence $\{X_n\}$ is increasing, i.e., that

$$ X_1 \subset X_2 \subset \cdots \subset X_n \subset \cdots. \tag{4} $$

A sequence $\{X_n\}$ satisfying the conditions (3) and (4) will be called exhaustive.
For example, the sequence $\{E_n\}$ in Remark 3, p. 267 is an exhaustive sequence
(with respect to ordinary Lebesgue measure), whose union is the whole
plane.

Now let f be a measurable function on X.⁶ Then f is said to be integrable
(or summable) on X if it is integrable on every measurable subset A ⊂ X
and if the limit

$$ \lim_{n \to \infty} \int_{X_n} f(x)\,d\mu \tag{5} $$

exists (and is finite) for every exhaustive sequence $\{X_n\}$. The limit (5) is then
called the (Lebesgue) integral of f over the set X, denoted by

$$ \int_X f(x)\,d\mu. $$

Remark 1. The limit (5) is independent of the choice of the exhaustive
sequence $\{X_n\}$. In fact, suppose

$$ \lim_{n \to \infty} \int_{X_n} f(x)\,d\mu \neq \lim_{n \to \infty} \int_{X_n^*} f(x)\,d\mu, $$

where $\{X_n^*\}$ is another exhaustive sequence. Define a new sequence $\{Q_n\}$
such that

$Q_1 = X_1$,

$Q_{2n}$ is any set of $\{X_n^*\}$ containing $Q_{2n-1}$,

$Q_{2n+1}$ is any set of $\{X_n\}$ containing $Q_{2n}$

(why do such sets exist?). Then $\{Q_n\}$ is exhaustive, but

$$ \lim_{n \to \infty} \int_{Q_n} f(x)\,d\mu $$

fails to exist, contrary to hypothesis.


6 A real function y = f(x) is now said to be measurable if the set $f^{-1}(A) \cap X_n$ is
measurable for every $X_n$ and every Borel set A (this being the obvious slight generalization
of Definition 1, p. 284).


Remark 2. The integral of a simple function is defined in the same way
as on p. 294. It is clear that a necessary (but not sufficient) condition for 
integrability of a simple function f is that f take every nonzero value on a set 
of finite measure. 


30.3. The Lebesgue integral vs. the Riemann integral. Finally we examine 
the relation between the Lebesgue integral and the Riemann integral, 
restricting ourselves to the case of ordinary Lebesgue measure on the line: 


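THEOREM 4. If the Riemann integral

$$ I = \int_a^b f(x)\,dx $$

exists, then f is Lebesgue integrable on [a, b] and

$$ \int_{[a,b]} f(x)\,d\mu = I. \tag{6} $$

Proof. Introducing the points of subdivision

$$ x_k = a + \frac{k}{2^n}(b - a) \qquad (k = 0, 1, \ldots, 2^n), $$

we partition [a, b] into $2^n$ subintervals. Let

$$ \Delta_n = \frac{b - a}{2^n} \sum_{k=1}^{2^n} M_{nk}, \qquad \delta_n = \frac{b - a}{2^n} \sum_{k=1}^{2^n} m_{nk} $$

be the corresponding Darboux sums, where $M_{nk}$ is the least upper bound
and $m_{nk}$ the greatest lower bound of f on the subinterval $x_{k-1} \le x \le x_k$.
By the definition of the Riemann integral,

$$ I = \lim_{n \to \infty} \Delta_n = \lim_{n \to \infty} \delta_n. $$

Consider the functions

$$ \bar f_n(x) = M_{nk} \quad \text{if } x_{k-1} \le x < x_k, $$

$$ \underline f_n(x) = m_{nk} \quad \text{if } x_{k-1} \le x < x_k, $$

$$ \bar f_n(b) = \underline f_n(b) = f(b). $$

Then clearly

$$ \int_{[a,b]} \bar f_n(x)\,d\mu = \Delta_n, \qquad \int_{[a,b]} \underline f_n(x)\,d\mu = \delta_n. \tag{7} $$

Moreover,

$$ \bar f_1(x) \ge \bar f_2(x) \ge \cdots \ge \bar f_n(x) \ge \cdots \ge f(x), $$

$$ \underline f_1(x) \le \underline f_2(x) \le \cdots \le \underline f_n(x) \le \cdots \le f(x), $$

and hence

$$ \lim_{n \to \infty} \bar f_n(x) = \bar f(x) \ge f(x), $$

$$ \lim_{n \to \infty} \underline f_n(x) = \underline f(x) \le f(x). $$

Using (7) and Theorem 2, we find that

$$ \int_{[a,b]} \bar f(x)\,d\mu = \lim_{n \to \infty} \int_{[a,b]} \bar f_n(x)\,d\mu = \lim_{n \to \infty} \Delta_n = I = \lim_{n \to \infty} \delta_n = \lim_{n \to \infty} \int_{[a,b]} \underline f_n(x)\,d\mu = \int_{[a,b]} \underline f(x)\,d\mu \tag{8} $$

(see also Problem 2). Therefore

$$ \int_{[a,b]} |\bar f(x) - \underline f(x)|\,d\mu = \int_{[a,b]} \{\bar f(x) - \underline f(x)\}\,d\mu = 0, $$

and hence

$$ \bar f(x) - \underline f(x) = 0 $$

almost everywhere, by the corollary on p. 300. In other words,

$$ \bar f(x) = \underline f(x) = f(x) \tag{9} $$

almost everywhere. Comparing (8) and (9), we get (6). ∎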
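
The upper and lower Darboux sums over dyadic subdivisions, which drive the proof above, are easy to compute exactly when f is nondecreasing, because the least upper bound and greatest lower bound on each subinterval are then attained at its right and left endpoints. This monotonicity restriction is an assumption made here to keep the sketch short; the function name is hypothetical.

```python
from fractions import Fraction


def darboux_sums(f, a, b, n):
    """Upper and lower Darboux sums of a nondecreasing f over 2**n equal subintervals."""
    parts = 2 ** n
    h = Fraction(b - a, parts)
    xs = [a + k * h for k in range(parts + 1)]
    upper = h * sum(f(xs[k]) for k in range(1, parts + 1))      # sup at right endpoint
    lower = h * sum(f(xs[k - 1]) for k in range(1, parts + 1))  # inf at left endpoint
    return upper, lower


def square(x):
    return x * x


sums = [darboux_sums(square, 0, 1, n) for n in range(1, 9)]
for n, (upper, lower) in enumerate(sums, start=1):
    print(n, float(lower), float(upper))
```

For f(x) = x² on [0, 1] the upper sums decrease, the lower sums increase, and their difference is exactly 1/2^n, so both squeeze the common value 1/3, in line with the monotone sequences of step functions used in the proof.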
Problem 1. Prove that

$$ \lim_{n \to \infty} \int_A f_n(x) g(x)\,d\mu = \int_A f(x) g(x)\,d\mu $$

if the sequence $\{f_n\}$ satisfies the conditions of Theorem 1 (as stated more
generally in the remark on p. 305) and if g is essentially bounded on A in
the sense that there is a constant M > 0 such that |g(x)| ≤ M almost
everywhere on A.
Comment. If g is essentially bounded on A, then the quantity

$$ \operatorname*{ess\,sup}_{x \in A} |g(x)| = \inf_{\substack{Z \subset A \\ \mu(Z) = 0}} \left\{ \sup_{x \in A - Z} |g(x)| \right\}, $$

called the essential supremum of g on A, is finite.


Problem 2. Prove that Theorem 2 remains valid if

$$ f_1(x) \ge f_2(x) \ge \cdots \ge f_n(x) \ge \cdots $$

and if (2) is replaced by the condition

$$ \int_A f_n(x)\,d\mu \ge M \qquad (n = 1, 2, \ldots). $$


Problem 3. Consider the system $\mathscr{S}$ of all subsets of the real line
containing only finitely many points, and let the measure μ(A) of a set
$A \in \mathscr{S}$ be defined as the number of points in A. Prove that


a) $\mathscr{S}$ is a ring without a unit;
b) μ is not σ-finite.


Problem 4. Why do we talk about a o-ring rather than a o-algebra on 
p. 308? 


Problem 5. Prove that if a function f vanishes outside a set of finite 
measure, then its Lebesgue integral as defined on p. 308 coincides with its 
Lebesgue integral as previously defined. 


Problem 6. Show that the analogue of the definition on p. 296 cannot be 
used to define the Lebesgue integral in the case where A is of infinite measure. 


Hint. Give an example of a uniformly convergent sequence $\{f_n\}$ of
integrable simple functions such that

$$ \lim_{n \to \infty} \int_A f_n(x)\,d\mu $$

fails to exist.


Problem 7. Which of the theorems of Sec. 29 continue to hold for 
integrals over sets of infinite measure? 


Hint. The corollary on p. 298 fails if A is of infinite measure. 


Problem 8. Verify that Theorems 1-3 of Sec. 30.1 continue to hold for 
integrals over sets of infinite measure. 


Problem 9. Given a nonnegative function f, suppose the Riemann integral

$$ \int_{a+\varepsilon}^{b} f(x)\,dx $$

exists for every ε > 0 and approaches a finite limit as ε → 0+, so that the
improper Riemann integral

$$ \int_a^b f(x)\,dx = \lim_{\varepsilon \to 0+} \int_{a+\varepsilon}^{b} f(x)\,dx \tag{10} $$

exists. Prove that f is Lebesgue integrable on [a, b] and

$$ \int_{[a,b]} f(x)\,d\mu = \int_a^b f(x)\,dx. $$

Comment. On the other hand, if f is of variable sign and if

$$ \lim_{\varepsilon \to 0+} \int_{a+\varepsilon}^{b} |f(x)|\,dx = \infty, $$

then the Lebesgue integral of f over [a, b] fails to exist, even if the improper
Riemann integral (10) exists. In fact, by Problem 5, p. 302, summability
of f would imply that of |f|.


Problem 10. Prove that the integral

$$ \int_0^1 \frac{1}{x} \sin \frac{1}{x}\,dx $$

exists as an improper Riemann integral, but not as a Lebesgue integral.


Problem 11, Suppose f is Riemann integrable over an infinite interval 
(such an integral can exist only in the improper sense). Prove that f is 
Lebesgue integrable over the same interval if and only if the improper 
integral converges absolutely. 


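Comment. For example, the function 

$$f(x) = \frac{\sin x}{x}$$

is not Lebesgue integrable over $(-\infty, \infty)$, since 

$$\int_{-\infty}^{\infty} \left|\frac{\sin x}{x}\right| dx = \infty.$$

On the other hand, f has an improper Riemann integral equal to 

$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx = \pi.$$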


9 


DIFFERENTIATION 


Let f be a summable function defined on a space X, equipped with a 
σ-additive measure μ. Then the (Lebesgue) integral 

$$\int_E f(x)\,d\mu \tag{1}$$

exists for every measurable $E \subset X$, thereby defining a set function on the 
system $\mathscr{S}_\mu$ of all measurable subsets of X. If X is the real line, equipped 
with ordinary Lebesgue measure μ, and if E = [a, b] is a closed interval, we 
write (1) simply as 

$$\int_a^b f(x)\,dx,$$

or equivalently as 

$$\int_a^b f(t)\,dt \tag{2}$$

in terms of the new dummy variable of integration t (here we anticipate 
subsequent notational convenience). Then (2) is clearly a function of the 
lower limit of integration a and the upper limit of integration b. Suppose we 
fix a, but leave b variable, indicating this by replacing b by the symbol x. 
Then (2) reduces to the “indefinite Lebesgue integral” 

$$\int_a^x f(t)\,dt,$$

with its upper limit of integration variable. 
Now let f be continuous, and let F have a continuous derivative. Then 
it will be recalled from elementary calculus that the connection between 
the operations of differentiation and integration is expressed by the familiar 
formulas 

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x), \tag{3}$$

$$\int_a^x F'(t)\,dt = F(x) - F(a). \tag{4}$$


This immediately suggests two questions: 


1) Does (3) continue to hold for an arbitrary summable function f? 
2) What is the largest class of functions for which (4) holds? 


These questions will be answered in Secs. 31-33. The study of the general 
set function (1) will be resumed in Sec. 34. 


31. Differentiation of the Indefinite Lebesgue Integral 


31.1. Basic properties of monotonic functions. We begin our study of the 
indefinite Lebesgue integral 


$$F(x) = \int_a^x f(t)\,dt \tag{1}$$


as a function of its upper limit by making the following obvious but important 
observation. If f is nonnegative, then (1) is a nondecreasing function. 
Moreover, since every summable function f(t) is the difference 


$$f(t) = f_+(t) - f_-(t)$$


of two nonnegative summable functions (which?), the integral (1) is 
the difference between two nondecreasing functions. Hence, the study of the 
Lebesgue integral as a function of its upper limit is closely related to the 
study of monotonic functions. Monotonic functions are interesting in their 
own right, and have a number of simple and important properties which 
we now discuss. Here all functions will be regarded as defined on some 
fixed interval [a, b] unless the contrary is explicitly stated. 


DEFINITION 1. A function f is said to be nondecreasing if $x_1 < x_2$ 
implies $f(x_1) \le f(x_2)$ and nonincreasing if $x_1 < x_2$ implies $f(x_1) \ge f(x_2)$. 
By a monotonic function is meant a function which is either nondecreasing 
or nonincreasing. 


DEFINITION 2. Given any function f, the limit 

$$\lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} f(x_0 + \varepsilon)$$

(provided it exists) is called the right-hand limit of f at the point $x_0$, 
denoted by 

$$f(x_0 + 0).$$

Similarly, the limit 

$$\lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} f(x_0 - \varepsilon)$$

is called the left-hand limit of f at $x_0$, denoted by 

$$f(x_0 - 0).$$


Remark. If 

$$f(x_0 + 0) = f(x_0 - 0),$$

then clearly f is either continuous at $x_0$ or has a removable discontinuity 
at $x_0$. 


DEFINITION 3. A function f is said to be continuous from the right at 
$x_0$ if 

$$f(x_0) = f(x_0 + 0),$$

and continuous from the left at $x_0$ if 

$$f(x_0) = f(x_0 - 0).$$


DEFINITION 4. By a discontinuity point of the first kind of a function f 
is meant a point $x_0$ at which the limits $f(x_0 + 0)$ and $f(x_0 - 0)$ exist but are 
unequal. The difference 

$$f(x_0 + 0) - f(x_0 - 0)$$

is then called the jump of f at $x_0$. 


Example. Given no more than countably many points 

$$x_1, x_2, \ldots, x_n, \ldots$$

in the interval [a, b], let 

$$h_1, h_2, \ldots, h_n, \ldots$$

be corresponding positive numbers such that 

$$\sum_n h_n < \infty.$$

Then the function 

$$f(x) = \sum_{x_n < x} h_n, \tag{2}$$

where the sum is over all n such that $x_n < x$, is obviously nondecreasing. 
A monotonic function of this particularly simple type is called a jump 
function. A jump function such that 

$$x_1 < x_2 < \cdots < x_n < \cdots$$

is called a step function. For an example of a jump function which is not 
a step function, see Problem 1. 
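
The formula (2) can be tried out on a small finite example. The following is only a sketch: the points and heights are hypothetical illustration data, and exact rational arithmetic is used so that the comparison $x_n < x$ is not blurred by rounding. 

```python
from fractions import Fraction


def jump_function(points, heights):
    """Return f with f(x) = sum of h_n over all n such that x_n < x, as in (2)."""
    pairs = list(zip(points, heights))

    def f(x):
        return sum((h for p, h in pairs if p < x), Fraction(0))

    return f


# Hypothetical data: jumps at 1/4, 1/2, 3/4 with heights 1/2, 1/4, 1/8.
points = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
heights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
f = jump_function(points, heights)

# Because the inequality x_n < x is strict, the jump located at x_n
# is not yet counted in the value f(x_n) itself.
value_at_half = f(Fraction(1, 2))
value_just_after = f(Fraction(1, 2) + Fraction(1, 1000))
```

Here the value at 1/2 only includes the jump at 1/4, while the value slightly to the right of 1/2 also includes the jump at 1/2. 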


We now establish the basic properties of monotonic functions. To be 
explicit, we will talk about nondecreasing functions, but clearly everything 
carries over automatically to the case of nonincreasing functions. 


THEOREM 1. Every nondecreasing function f on [a, b] is measurable 
and bounded, and hence summable.¹ 


Proof. Since $f(a) \le f(x) \le f(b)$ for all $x \in [a, b]$, f is obviously bounded. 

Consider the set 

$$E_c = \{x : f(x) < c\}.$$

If $E_c$ is empty, then $E_c$ is (trivially) measurable. If $E_c$ is nonempty, let 
d be the least upper bound of all $x \in E_c$. Then $E_c$ is either the closed 
interval [a, d], if $d \in E_c$, or the half-open interval [a, d) if $d \notin E_c$. In 
either case, $E_c$ is measurable. ∎ 

THEOREM 2. Every discontinuity point of a nondecreasing function is 
of the first kind. 

Proof. Let $x_0$ be any point of [a, b], and let $\{x_n\}$ be any increasing sequence 
such that $x_n < x_0$, $x_n \to x_0$. Then $\{f(x_n)\}$ is a nondecreasing sequence 
bounded from above, e.g., by the number $f(x_0)$. Therefore 

$$\lim_{n\to\infty} f(x_n)$$

exists for any such sequence (and equals the least upper bound of f(x) 
over all $x < x_0$), i.e., $f(x_0 - 0)$ exists. The existence of 
$f(x_0 + 0)$ is proved in the same way. ∎ 
Obviously, a nondecreasing function need not be continuous. However, 
we have 

THEOREM 3. A nondecreasing function can have no more than countably 
many points of discontinuity. 

Proof. The sum of the jumps of f on the interval [a, b] cannot exceed 
f(b) − f(a). Let $J_n$ be the set of all jumps greater than 1/n, and let J be 
the set of all jumps regardless of size. Then obviously 

$$J = \bigcup_{n=1}^{\infty} J_n,$$

where each $J_n$ is a finite set. Hence J has no more than countably many 
elements. ∎ 
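THEOREM 4. The jump function (2) is continuous from the left. Moreover, 
all the discontinuity points of f are of the first kind, with the jump at $x_n$ 
equal to $h_n$. 


¹ See the corollary on p. 298. 


Proof. Clearly, 

$$f(x - 0) = \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} f(x - \varepsilon) = \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} \sum_{x_n < x - \varepsilon} h_n.$$

But if $x_n < x$, then $x_n < x - \varepsilon$ for sufficiently small $\varepsilon > 0$. Therefore 

$$\lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} \sum_{x_n < x - \varepsilon} h_n = \sum_{x_n < x} h_n = f(x),$$

and hence 

$$f(x - 0) = f(x).$$

If x coincides with one of the points $x_n$, say with $x_{n_0}$, then 

$$f(x_{n_0} + 0) = \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} f(x_{n_0} + \varepsilon) = \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} \sum_{x_n < x_{n_0} + \varepsilon} h_n = \sum_{x_n \le x_{n_0}} h_n,$$

and hence 

$$f(x_{n_0} + 0) - f(x_{n_0} - 0) = h_{n_0}. \qquad ∎$$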


THEOREM 5. If f is continuous from the left and nondecreasing, then 
f is the sum of a continuous nondecreasing function φ and a jump function ψ. 

Proof. If $x_1, x_2, \ldots$ are the discontinuity points of f, with corresponding 
jumps $h_1, h_2, \ldots$, let 

$$\psi(x) = \sum_{x_n < x} h_n, \qquad \varphi(x) = f(x) - \psi(x).$$

Then, for $x' < x''$, 

$$\varphi(x'') - \varphi(x') = [f(x'') - f(x')] - [\psi(x'') - \psi(x')],$$

where the expression on the right is the difference between the total 
increment of f on the interval [x′, x″] and the sum of its jumps on 
[x′, x″], i.e., φ(x″) − φ(x′) is the measure of the set of values taken by 
f at its continuity points in [x′, x″]. This quantity is clearly nonnegative, 
and hence φ is nondecreasing. Moreover, given any point $x \in [a, b]$, we 
have 

$$\varphi(x - 0) = \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} f(x - \varepsilon) - \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} \psi(x - \varepsilon) = f(x - 0) - \sum_{x_n < x} h_n,$$

$$\varphi(x + 0) = \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} f(x + \varepsilon) - \lim_{\substack{\varepsilon\to 0 \\ \varepsilon>0}} \psi(x + \varepsilon) = f(x + 0) - \sum_{x_n \le x} h_n,$$

which implies 

$$\varphi(x + 0) - \varphi(x - 0) = f(x + 0) - f(x - 0) - h = 0,$$

where h is the jump of ψ at x. It follows that φ is continuous at every 
point $x \in [a, b]$. ∎ 


31.2. Differentiation of a monotonic function. The key result of this 
section (see Theorem 6 below) will be to show that a monotonic function f 
defined on an interval [a, b] has a finite derivative almost everywhere on [a, b]. 
Before proving this proposition, due to Lebesgue, we must first introduce 
some further definitions and then establish three preliminary lemmas. 

The derivative of a function f at a point $x_0$ is defined in the familiar way 
as the limit of the ratio 

$$\frac{f(x) - f(x_0)}{x - x_0} \tag{3}$$

as $x \to x_0$. Even if this limit fails to exist, the following four quantities 
(which may take infinite values) always exist: 


1) The lower limit of (3) as $x \to x_0$ from the left, denoted by $\lambda_L$; 
2) The upper limit of (3) as $x \to x_0$ from the left, denoted by $\Lambda_L$;² 
3) The lower limit of (3) as $x \to x_0$ from the right, denoted by $\lambda_R$; 
4) The upper limit of (3) as $x \to x_0$ from the right, denoted by $\Lambda_R$. 


These four quantities, with the geometric meaning shown in Figure 17, are 
called the derived numbers of f at $x_0$.³ It is clear that the inequalities 

$$\lambda_L \le \Lambda_L, \qquad \lambda_R \le \Lambda_R \tag{4}$$

always hold. If $\lambda_L$ and $\Lambda_L$ exist and are equal, their common value is just 
the left-hand derivative of f at $x_0$. Similarly, if $\lambda_R$ and $\Lambda_R$ exist and are 
equal, their common value is just the right-hand derivative of f at $x_0$. 


FIGURE 17 


² Upper and lower limits are defined on p. 111. 

³ To distinguish these quantities further, we can call $\lambda_L$ the left-hand lower derived number, 
$\Lambda_R$ the right-hand upper derived number, and so on. 


Moreover, f has a derivative at $x_0$ if and only if all four derived numbers $\lambda_L$, $\Lambda_L$, 
$\lambda_R$ and $\Lambda_R$ exist and are equal at $x_0$. Hence the italicized assertion at the 
beginning of this section can be restated as follows: For a monotonic function 
defined on an interval [a, b], the formula 

$$-\infty < \lambda_L = \Lambda_L = \lambda_R = \Lambda_R < +\infty$$

holds almost everywhere on [a, b]. 
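
To get a feeling for the four derived numbers, one can sample the ratio (3) on each side of a point. The sketch below is only a finite-sample illustration: true lower and upper limits require letting the increment tend to zero, so the minima and maxima computed here are crude proxies, and the function names and sample increments are hypothetical choices made for the example. 

```python
import math


def sampled_derived_numbers(f, x0, steps):
    """Finite-sample proxies for the four derived numbers of f at x0.

    `steps` is a list of small positive increments h. Taking min/max of the
    difference quotient over a finite sample only approximates the lower and
    upper limits, which are defined as h -> 0.
    """
    right = [(f(x0 + h) - f(x0)) / h for h in steps]
    left = [(f(x0 - h) - f(x0)) / (-h) for h in steps]
    return {
        "lambda_L": min(left),
        "Lambda_L": max(left),
        "lambda_R": min(right),
        "Lambda_R": max(right),
    }


# A corner: f(x) = |x| at x0 = 0 has one-sided derivatives -1 and +1.
steps = [2.0 ** -k for k in range(5, 40)]
corner = sampled_derived_numbers(abs, 0.0, steps)


# An oscillating example: the right-hand quotient at 0 equals sin(1/h).
def g(x):
    return x * math.sin(1.0 / x) if x > 0 else 0.0


# Increments chosen so that sin(1/h) alternates between values near +1 and -1.
osc_steps = [1.0 / (math.pi / 2 + k * math.pi) for k in range(10, 200)]
oscillating = sampled_derived_numbers(g, 0.0, osc_steps)
```

In the first case the two left-hand numbers agree and the two right-hand numbers agree, but the left and right values differ, so there are one-sided derivatives and no derivative. In the second case the sampled right-hand lower and upper numbers are close to −1 and +1, so not even a right-hand derivative exists. 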


DEFINITION 5. Let f be a continuous function defined on an interval 
[a, b]. Then a point $x_0 \in [a, b]$ is said to be invisible from the right (with 
respect to f) if there is a point ξ such that $x_0 < \xi \le b$ and $f(x_0) < f(\xi)$, 
and invisible from the left if there is a point ξ such that $a \le \xi < x_0$ and 
$f(x_0) < f(\xi)$. 


Example. In Figure 18, the points belonging to the intervals $[a_1, b_1)$ and 
$(a_2, b_2)$ are invisible from the right (interpret the word “invisible”). 


LEMMA 1 (F. Riesz). The set of all points invisible from the right with 
respect to a function f continuous on [a, b] is the union of no more than 
countably many pairwise disjoint open intervals $(a_k, b_k)$,⁴ such that 

$$f(a_k) \le f(b_k) \qquad (k = 1, 2, \ldots). \tag{5}$$


Proof. If $x_0$ is invisible from the right with respect to f, then the 
same is true of any point sufficiently close to $x_0$, by the continuity of f. 
Hence the set of all points invisible from the right is an open set G. It 
follows from Theorem 6, p. 51 that G is the union of a finite or countable 
system of pairwise disjoint open intervals. Let $(a_k, b_k)$ be one of these 
intervals, and suppose 

$$f(a_k) > f(b_k). \tag{6}$$


FIGURE 18 


⁴ However, if $a_k = a$ (say), then in some cases $(a_k, b_k)$ should be replaced by the half-open 
interval $[a_k, b_k)$, as in Figure 18. This is permissible, since $[a_k, b_k)$ is open relative to 
[a, b]. 


Then there is an (interior) point $x_0 \in (a_k, b_k)$ such that $f(x_0) > f(b_k)$. 
Of the points $x \in (a_k, b_k)$ such that $f(x) = f(x_0)$, let x* be the one with 
largest abscissa (x* may coincide with $x_0$). Since x* belongs to $(a_k, b_k)$ 
and hence is invisible from the right, there is a point ξ > x* such that 
f(ξ) > f(x*). Clearly ξ cannot belong to $(a_k, b_k)$, since x* is the point 
x with largest abscissa for which $f(x) = f(x_0)$, while $f(b_k) < f(x_0)$, so 
that $\xi \in (a_k, b_k)$ would imply the existence of a point x > x* such that 
$f(x) = f(x_0)$. On the other hand, the inequality $\xi > b_k$ is also impossible, 
since it would imply $f(b_k) < f(x_0) < f(\xi)$ despite the fact that 
$b_k$ is not invisible from the right. Thus (6) leads to a contradiction 
(obviously $\xi \ne b_k$). It follows that $f(a_k) \le f(b_k)$. ∎ 
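
A discrete analogue of Lemma 1 can be checked on sampled values. The sketch below is an illustration only: the list `values` is hypothetical data standing in for a continuous f sampled at equally spaced points, an index counts as "invisible from the right" if some later sample is strictly larger, and the maximal runs of consecutive invisible indices play the role of the intervals $(a_k, b_k)$. 

```python
def invisible_from_right(values):
    """Indices i for which values[i] < values[j] for some j > i."""
    result = []
    best_to_right = float("-inf")  # largest sample strictly to the right of i
    for i in range(len(values) - 1, -1, -1):
        if values[i] < best_to_right:
            result.append(i)
        best_to_right = max(best_to_right, values[i])
    return sorted(result)


def maximal_runs(indices):
    """Group a sorted list of indices into maximal runs of consecutive integers."""
    runs = []
    for i in indices:
        if runs and runs[-1][1] == i - 1:
            runs[-1][1] = i
        else:
            runs.append([i, i])
    return [tuple(r) for r in runs]


# Hypothetical samples of a continuous function at equally spaced points.
values = [3, 1, 2, 4, 3, 2, 0, 1, 2, 1]
hidden = invisible_from_right(values)
runs = maximal_runs(hidden)

# Discrete counterpart of (5): every hidden sample in a run is no larger than
# the first visible sample just to the right of that run.
endpoint_check = all(
    values[i] <= values[end + 1]
    for start, end in runs
    for i in range(start, end + 1)
)
```

The suffix-maximum scan makes the picture behind the word "invisible" concrete: a point is hidden exactly when something higher lies to its right. 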


LEMMA 1′. The set of all points invisible from the left with respect to 
a function f continuous on [a, b] is the union of no more than countably 
many pairwise disjoint open intervals $(a_k, b_k)$, such that 

$$f(a_k) \ge f(b_k) \qquad (k = 1, 2, \ldots).$$

Proof. Virtually the same as that of Lemma 1. ∎ 


LEMMA 2. Let f be a continuous nondecreasing function on [a, b], with 
$\lambda_L$ and $\Lambda_R$ as two of its derived numbers. Given any numbers c, C and ρ 
such that 

$$0 < c < C < \infty, \qquad \rho = \frac{c}{C},$$

let $E_\rho$ be the set 

$$E_\rho = \{x : \lambda_L < c,\ \Lambda_R > C\}.$$

Then 

$$\mu\{x : x \in E_\rho \cap (\alpha, \beta)\} \le \rho(\beta - \alpha)$$

for every open interval $(\alpha, \beta) \subset [a, b]$. 


Proof. Let $x_0$ be a point of $(\alpha, \beta)$ for which $\lambda_L < c$. Then there is a 
point $\xi < x_0$ such that 

$$\frac{f(\xi) - f(x_0)}{\xi - x_0} < c,$$

i.e., such that 

$$f(\xi) - c\xi > f(x_0) - cx_0.$$

Therefore $x_0$ is invisible from the left with respect to the function 
f(x) − cx. Hence, by Lemma 1′, the set of all such $x_0$ is the union of 
no more than countably many pairwise disjoint open intervals $(\alpha_k, \beta_k) \subset (\alpha, \beta)$, where 

$$f(\alpha_k) - c\alpha_k \ge f(\beta_k) - c\beta_k,$$

or equivalently 

$$f(\beta_k) - f(\alpha_k) \le c(\beta_k - \alpha_k). \tag{7}$$

Let $G_k$ be the set of points in $(\alpha_k, \beta_k)$ for which $\Lambda_R > C$. Then, by 
virtually the same argument together with Lemma 1, $G_k$ is the union of 
no more than countably many pairwise disjoint open intervals $(\alpha_{kn}, \beta_{kn})$, 
where 

$$\beta_{kn} - \alpha_{kn} \le \frac{1}{C}\,[f(\beta_{kn}) - f(\alpha_{kn})] \tag{8}$$

(why?). Clearly $E_\rho \cap (\alpha, \beta)$ is covered by the system of intervals $(\alpha_{kn}, \beta_{kn})$. 
Moreover, it follows from (7) and (8) that 

$$\sum_{k,n} (\beta_{kn} - \alpha_{kn}) \le \frac{1}{C} \sum_{k,n} [f(\beta_{kn}) - f(\alpha_{kn})] \le \frac{1}{C} \sum_k [f(\beta_k) - f(\alpha_k)] \le \frac{c}{C} \sum_k (\beta_k - \alpha_k) \le \rho(\beta - \alpha). \qquad ∎$$
We are now in a position to prove 


THEOREM 6 (Lebesgue). A monotonic function f defined on an interval 
[a, b] has a finite derivative almost everywhere on [a, b]. 


Proof. There is no loss of generality in assuming that f is nondecreasing, 
since if f is nonincreasing, then obviously −f is nondecreasing. 
But if −f has a derivative almost everywhere, then so does f. We 
also assume that f is continuous, dropping this restriction at the end of 
the proof. It will be enough to show that the two inequalities 

$$\Lambda_R < +\infty \tag{9}$$

and 

$$\lambda_L \ge \Lambda_R \tag{10}$$

hold almost everywhere on [a, b], for any continuous nondecreasing 
function. In fact, setting f*(x) = −f(−x), we see that f* is continuous 
and nondecreasing, like f itself. Moreover, it is easily verified that 

$$\lambda_L^* = \lambda_R, \qquad \Lambda_R^* = \Lambda_L,$$

where $\lambda_L^*$ and $\Lambda_R^*$ are the indicated derived numbers of f*. Therefore, 
applying (10) to f*, we get 

$$\lambda_L^* \ge \Lambda_R^*$$

or 

$$\lambda_R \ge \Lambda_L. \tag{11}$$

Combining the inequalities (10) and (11), we obtain 

$$\Lambda_R \le \lambda_L \le \Lambda_L \le \lambda_R \le \Lambda_R,$$

after using (4). Thus if (9) and (10) hold almost everywhere, we have⁵ 

$$-\infty < \lambda_L = \Lambda_L = \lambda_R = \Lambda_R < +\infty$$

almost everywhere, and the theorem is proved. 
To prove that $\Lambda_R < +\infty$ almost everywhere, we argue as follows: 
If $\Lambda_R = +\infty$ at some point $x_0$, then, given any constant C > 0, there is 
a point $\xi > x_0$ such that 

$$\frac{f(\xi) - f(x_0)}{\xi - x_0} > C,$$

i.e., 

$$f(\xi) - f(x_0) > C(\xi - x_0),$$

or equivalently 

$$f(\xi) - C\xi > f(x_0) - Cx_0.$$

Thus $x_0$ is invisible from the right with respect to the function f(x) − Cx. 
Hence, by Lemma 1, the set of all points $x_0$ at which $\Lambda_R = +\infty$ is contained in the 
union of no more than countably many open intervals $(a_k, b_k)$, whose 
end points satisfy the inequalities 

$$f(a_k) - Ca_k \le f(b_k) - Cb_k,$$

or 

$$f(b_k) - f(a_k) \ge C(b_k - a_k).$$

Dividing by C and summing over all the intervals $(a_k, b_k)$, we get 

$$\sum_k (b_k - a_k) \le \sum_k \frac{f(b_k) - f(a_k)}{C} \le \frac{f(b) - f(a)}{C}.$$

But C can be made arbitrarily large. Hence the set of points where $\Lambda_R = +\infty$ 
can be covered by a collection of intervals the sum of whose lengths 
is arbitrarily small. It follows that this set is of measure zero, i.e., that 
$\Lambda_R < +\infty$ almost everywhere. 

To prove that $\lambda_L \ge \Lambda_R$ almost everywhere, let the numbers c, C, 
ρ and the set $E_\rho$ be the same as in Lemma 2. It will then follow that 
$\lambda_L \ge \Lambda_R$ almost everywhere if we succeed in showing that $\mu(E_\rho) = 0$, 
since the set of points where $\lambda_L < \Lambda_R$ can clearly be represented as the 
union of no more than countably many sets of the form $E_\rho$ (why?). 
Let $\mu(E_\rho) = t$. Then, given any $\varepsilon > 0$, there is an open set G, equal 
to the union of no more than countably many open intervals $(a_k, b_k)$ 
such that $E_\rho \subset G$ and 

$$\sum_k (b_k - a_k) < t + \varepsilon$$

(this follows from the very definition of Lebesgue measure on the line). 
If 

$$t_k = \mu[E_\rho \cap (a_k, b_k)],$$

then 

$$t \le \sum_k t_k.$$

But $t_k \le \rho(b_k - a_k)$, by Lemma 2. Hence 

$$t \le \rho \sum_k (b_k - a_k) < \rho(t + \varepsilon),$$

which implies $t \le \rho t$, since $\varepsilon > 0$ is arbitrary. This in turn implies 
t = 0, since $0 < \rho < 1$. Therefore $\lambda_L \ge \Lambda_R$ almost everywhere, as 
asserted. 

Finally, to drop the requirement that f be continuous, we need only 
generalize Lemmas 1 and 1′ in the way indicated in Problem 6, noting that 
the proof continues to go through (check details). ∎ 


⁵ Note that $\lambda_L$ cannot equal $-\infty$, since the difference quotient (3) is inherently nonnegative 
if f is nondecreasing. 


Remark. Despite its apparent complexity, the proof of Theorem 6 is 
based on simple intuitive ideas. For example, the finiteness of $\Lambda_R$ (and $\Lambda_L$) 
almost everywhere is easily made plausible. In fact, let f be continuous and 
nondecreasing on [a, b]. Then f maps [a, b] into the interval [f(a), f(b)], at 
the same time subjecting a small interval [x, ξ] at x to a “magnification” 
approximately equal to 

$$\gamma(x) = \frac{f(\xi) - f(x)}{\xi - x}.$$

But the interval [f(a), f(b)] is finite, and hence γ(x) cannot be infinite on a 
set of positive measure. As for the part of the proof based on Lemma 2, 
it merely says that if the intersection of a subset $A \subset [a, b]$ with every interval 
$(\alpha, \beta)$ has measure no greater than $\rho(\beta - \alpha)$ for some fixed number $\rho < 1$, 
then A cannot have positive measure. 


31.3. Differentiation of an integral with respect to its upper limit. Returning 
to the problem of differentiating the indefinite Lebesgue integral, we have 


THEOREM 7. Let f be any function summable on [a, b]. Then 

$$\frac{d}{dx} \int_a^x f(t)\,dt \tag{12}$$

exists and is finite for almost all x. 

Proof. As noted at the beginning of Sec. 31.1, 

$$f(t) = f_+(t) - f_-(t),$$

where $f_+$ and $f_-$ are nonnegative summable functions, so that 

$$F(x) = \int_a^x f(t)\,dt = \int_a^x f_+(t)\,dt - \int_a^x f_-(t)\,dt = F_+(x) - F_-(x)$$

is the difference between two nondecreasing functions $F_+$ and $F_-$. But $F_+$ 
and $F_-$ have finite derivatives almost everywhere, by Theorem 6, and 
hence so does F. ∎ 


⁶ For an alternative proof, see Problems 7-9. 


We now evaluate the derivative (12), thereby giving an affirmative answer 
to the first of the two questions posed on p. 314: 


THEOREM 8. Let f be any function summable on [a, b]. Then 

$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x)$$

almost everywhere. 
Proof. Let 

$$F(x) = \int_a^x f(t)\,dt.$$

Then it will be enough to show that 

$$f(x) \ge F'(x) \tag{13}$$

almost everywhere for any summable function. In fact, changing f(x) 
to −f(x) in (13), we get 

$$-f(x) \ge -F'(x)$$

and hence 

$$f(x) \le F'(x). \tag{14}$$

But (13) and (14) together imply the desired result 

$$f(x) = F'(x) = \frac{d}{dx} \int_a^x f(t)\,dt$$

(almost everywhere). 
To prove (13), we observe that if 

$$f(x) < F'(x),$$

then there are rational numbers α and β such that 

$$f(x) < \alpha < \beta < F'(x). \tag{15}$$

Let $E_{\alpha\beta}$ be the set of all x satisfying (15). Then, as we now show, 
$\mu(E_{\alpha\beta}) = 0$. Since the number of sets $E_{\alpha\beta}$ is countable, this will imply 

$$\mu\{x : f(x) < F'(x)\} = 0$$

and hence that (13) holds almost everywhere. 

To prove that $\mu(E_{\alpha\beta}) = 0$, we first note that, given any $\varepsilon > 0$, there is 
a $\delta > 0$ such that $\mu(E) < \delta$ implies 

$$\left| \int_E f(t)\,dt \right| < \varepsilon$$

(the existence of such a number δ follows from the absolute continuity 
of the Lebesgue integral, proved in Theorem 6, p. 300).⁷ Let $G \subset [a, b]$ 
be an open set, made up of no more than countably many pairwise 
disjoint open intervals $(a_k, b_k)$, such that 

$$E_{\alpha\beta} \subset G, \qquad \mu(G) < \mu(E_{\alpha\beta}) + \delta,$$

and let $x_0$ be any point in $G_k = E_{\alpha\beta} \cap (a_k, b_k)$. Then 

$$\frac{F(\xi) - F(x_0)}{\xi - x_0} > \beta \tag{16}$$

for any point $\xi > x_0$ sufficiently close to $x_0$. Writing (16) in the form 

$$F(\xi) - \beta\xi > F(x_0) - \beta x_0,$$

we see that the point $x_0$ is invisible from the right with respect to the 
continuous function F(x) − βx. It follows from Lemma 1 that $G_k$ is 
contained in the union of no more than countably many pairwise disjoint open 
intervals $(a_{kn}, b_{kn})$, where 

$$F(a_{kn}) - \beta a_{kn} \le F(b_{kn}) - \beta b_{kn},$$

i.e., 

$$F(b_{kn}) - F(a_{kn}) \ge \beta(b_{kn} - a_{kn}),$$

or equivalently 

$$\int_{a_{kn}}^{b_{kn}} f(t)\,dt \ge \beta(b_{kn} - a_{kn}). \tag{17}$$

If 

$$S = \bigcup_{k,n} (a_{kn}, b_{kn}),$$

then clearly 

$$E_{\alpha\beta} \subset S \subset G, \qquad \mu(S) < \mu(E_{\alpha\beta}) + \delta.$$

Summing (17) over all the intervals $(a_{kn}, b_{kn})$, we get 

$$\int_S f(t)\,dt = \sum_{k,n} \int_{a_{kn}}^{b_{kn}} f(t)\,dt \ge \beta \sum_{k,n} (b_{kn} - a_{kn}) = \beta\mu(S).$$

On the other hand, 

$$\int_S f(t)\,dt = \int_{E_{\alpha\beta}} f(t)\,dt + \int_{S - E_{\alpha\beta}} f(t)\,dt \le \alpha\mu(E_{\alpha\beta}) + \varepsilon \le \alpha\mu(S) + |\alpha|\delta + \varepsilon. \tag{18}$$

Comparing (17) and (18), we get 

$$\alpha\mu(S) + |\alpha|\delta + \varepsilon \ge \beta\mu(S)$$

or 

$$\mu(S) \le \frac{|\alpha|\delta + \varepsilon}{\beta - \alpha}.$$

Therefore $E_{\alpha\beta}$ is contained in an open set of arbitrarily small measure (it 
can be assumed that $|\alpha|\delta < \varepsilon$). It follows that $\mu(E_{\alpha\beta}) = 0$. ∎ 


⁷ In particular, F(x) is continuous. In fact, 

$$|F(x') - F(x'')| = \left| \int_{x'}^{x''} f(t)\,dt \right| < \varepsilon$$

if $|x' - x''| < \delta$. 
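
The phrase "almost everywhere" in Theorem 8 can be illustrated on a very simple summable function. The sketch below is not part of the proof: the step function f and the closed form of its indefinite integral F (worked out by hand) are hypothetical illustration data, and exact rational arithmetic is used so that the difference quotients can be compared with f exactly. 

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def f(t):
    """A step function on [0, 1]: 0 to the left of 1/2, and 1 from 1/2 on."""
    return Fraction(1) if t >= HALF else Fraction(0)


def F(x):
    """Integral of f over [0, x], computed by hand for this particular f."""
    return max(Fraction(0), x - HALF)


def quotient(x, h):
    """Difference quotient of F at x with (possibly negative) increment h."""
    return (F(x + h) - F(x)) / h


h = Fraction(1, 1000)

# Away from the jump of f, both one-sided quotients of F already equal f(x).
inside = [Fraction(1, 4), Fraction(3, 4)]
matches = [quotient(x, h) == f(x) == quotient(x, -h) for x in inside]

# At the jump the left and right quotients disagree, so F has no derivative
# there; a single point is a set of measure zero, as the theorem allows.
at_jump = (quotient(HALF, -h), quotient(HALF, h))
```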


Problem 1. Let $x_1, x_2, \ldots, x_n, \ldots$ be the set of all rational points in 
[a, b], enumerated in any way, and let $h_n = 1/2^n$. Prove that the jump 
function 

$$f(x) = \sum_{x_n < x} h_n$$

is discontinuous at every rational point and continuous at every irrational 
point. 

Problem 2. Suppose we define a jump function by the formula 

$$f(x) = \sum_{x_n \le x} h_n, \tag{19}$$

rather than by the formula (2). Prove that f is continuous from the right, 
rather than from the left as in Theorem 4. 

Problem 3. Find the derived numbers of the function 

$$f(x) = \begin{cases} x \sin \dfrac{1}{x} & \text{if } x > 0, \\ 0 & \text{if } x \le 0 \end{cases}$$

at the point x = 0. 


Problem 4. Find the points invisible from the left in Figure 18, p. 319. 
Problem 5. In Lemma 1, show that $f(a_k) = f(b_k)$ if $a_k \ne a$. 


Problem 6. Prove that the requirement that f be continuous on [a, b] can 
be dropped in Lemma 1, provided that 


1) The discontinuity points of f are all of the first kind; 
2) A point $x_0 \in [a, b]$ is said to be invisible from the right (with respect 
to f) if there is a point ξ such that $x_0 < \xi \le b$ and 

$$\max\{f(x_0 - 0),\ f(x_0),\ f(x_0 + 0)\} < f(\xi);$$

3) The inequality (5) is replaced by 

$$f(a_k + 0) \le \max\{f(b_k - 0),\ f(b_k),\ f(b_k + 0)\}.$$


State and prove the corresponding generalization of Lemma 1′. 


Problem 7. Let 

$$\sum_{n=1}^{\infty} \varphi_n(x) = f(x) \tag{20}$$

be an everywhere convergent series, whose general term $\varphi_n(x)$ is nondecreasing 
(alternatively, nonincreasing) on [a, b]. Prove that (20) can be differentiated 
term by term almost everywhere, i.e., that 

$$\sum_{n=1}^{\infty} \varphi_n'(x) = f'(x)$$

almost everywhere. 

Problem 8. Prove that every jump function has a zero derivative almost 
everywhere. 

Hint. Use Problem 7. 


Problem 9. Prove that the assumption that f be continuous from the left 
in Theorem 5 can be dropped if we define a jump function as a sum of a 
“left jump function” like (2) and a “right jump function” like (19). Use 
this fact and Problem 8 to complete the proof of Theorem 6 without recourse 
to Problem 6. 

Hint. Use Problem 8 and Theorem 5. 


Problem 10. Following van der Waerden, let 

$$\varphi_0(x) = \begin{cases} x & \text{if } 0 \le x \le \tfrac12, \\ 1 - x & \text{if } \tfrac12 < x \le 1, \end{cases}$$

and continue $\varphi_0$ by periodicity, with period 1, over the whole x-axis. Then 
let 

$$\varphi_n(x) = \frac{1}{4^n}\,\varphi_0(4^n x) \qquad (n = 1, 2, \ldots),$$

$$f(x) = \sum_{n=0}^{\infty} \varphi_n(x).$$

Prove that 
a) The function f is continuous everywhere; 
b) The derivative of f fails to exist at every point $x_0 \in (-\infty, \infty)$. 


Hint. Consider the increments 

$$f\!\left(x_0 \pm \frac{1}{4^n}\right) - f(x_0).$$


32. Functions of Bounded Variation 


The problem of differentiating a Lebesgue integral with respect to its 
upper limit has led us to consider functions that can be represented as 
differences between two monotonic functions. We now give a different 
description of such functions (independent of the notion of monotonicity), 
afterwards studying some of their properties. 


DEFINITION 1. A function f defined on an interval [a, b] is said to be 
of bounded variation if there is a constant C > 0 such that 

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| \le C \tag{1}$$

for every partition 

$$a = x_0 < x_1 < \cdots < x_n = b \tag{2}$$

of [a, b] by points of subdivision $x_0, x_1, \ldots, x_n$. 


Example. Every monotonic function is of bounded variation, since the 
left-hand side of (1) equals |f(b) − f(a)| regardless of the choice of partition. 


DEFINITION 2. Let f be a function of bounded variation. Then by the 
total variation of f on [a, b], denoted by $V_a^b(f)$, is meant the quantity 

$$V_a^b(f) = \sup \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})|, \tag{3}$$

where the least upper bound is taken over all (finite) partitions (2) of the 
interval [a, b]. 
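
The sums appearing in (1) and (3) are easy to compute for a given partition. The following sketch uses hypothetical helper names and uniform partitions, which are just one convenient choice; any single partition only gives a lower bound for the total variation, and floating-point values are approximate. 

```python
import math


def variation_sum(f, partition):
    """Sum of |f(x_k) - f(x_{k-1})| over consecutive points of the partition."""
    return sum(abs(f(x1) - f(x0)) for x0, x1 in zip(partition, partition[1:]))


def uniform_partition(a, b, n):
    """Points a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * k / n for k in range(n + 1)]


two_pi = 2 * math.pi

# sin rises by 1, falls by 2 and rises by 1 on [0, 2*pi]: total variation 4.
three_parts = variation_sum(math.sin, uniform_partition(0.0, two_pi, 3))
four_parts = variation_sum(math.sin, uniform_partition(0.0, two_pi, 4))
many_parts = variation_sum(math.sin, uniform_partition(0.0, two_pi, 4000))

# For a monotonic function the sum telescopes to |f(b) - f(a)|.
monotone = variation_sum(math.exp, uniform_partition(0.0, 1.0, 37))
```

The partition into three parts misses the extrema of the sine and gives a sum smaller than 4, whereas the partitions that contain the points π/2 and 3π/2 already give (up to rounding) the full variation. 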


Remark 1. A function f defined on the whole real line $(-\infty, \infty)$ is said 
to be of bounded variation if there is a constant C > 0 such that 

$$V_a^b(f) \le C$$

for every pair of real numbers a and b (a < b). The quantity 

$$\lim_{\substack{a\to -\infty \\ b\to \infty}} V_a^b(f)$$

is then called the total variation of f on $(-\infty, \infty)$, denoted by $V_{-\infty}^{\infty}(f)$. 


Remark 2. It is an immediate consequence of (3) that 

$$V_a^b(\alpha f) = |\alpha|\, V_a^b(f) \tag{4}$$

for any constant α. 
THEOREM 1. If f and g are functions of bounded variation on [a, b], 
then so is f + g and 

$$V_a^b(f + g) \le V_a^b(f) + V_a^b(g). \tag{5}$$

Proof. For any partition of the interval [a, b], we have 

$$\sum_k |f(x_k) + g(x_k) - f(x_{k-1}) - g(x_{k-1})| \le \sum_k |f(x_k) - f(x_{k-1})| + \sum_k |g(x_k) - g(x_{k-1})|.$$

Taking the least upper bound of both sides over all partitions of 
[a, b], and noting that 

$$\sup\{x + y : x \in A,\ y \in B\} \le \sup\{x : x \in A\} + \sup\{y : y \in B\},$$

we immediately get (5). ∎ 


It follows from (4) and (5) that any linear combination of functions of 
bounded variation is itself a function of bounded variation. In other words, 
the set of all functions of bounded variation on a given interval is a linear 
space (unlike the set of all monotonic functions). 


THEOREM 2. If a < b < c, then 

$$V_a^c(f) = V_a^b(f) + V_b^c(f). \tag{6}$$


Proof. First we consider a partition of the interval [a, c] such that 
b is one of the points of subdivision, say $x_r = b$. Then 

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| = \sum_{k=1}^{r} |f(x_k) - f(x_{k-1})| + \sum_{k=r+1}^{n} |f(x_k) - f(x_{k-1})| \le V_a^b(f) + V_b^c(f). \tag{7}$$

Now consider an arbitrary partition of [a, c]. It is clear that adding an 
extra point of subdivision to this partition can never decrease the sum 

$$\sum_{k=1}^{n} |f(x_k) - f(x_{k-1})|.$$

Therefore (7) holds for any subdivision of [a, c], and hence 

$$V_a^c(f) \le V_a^b(f) + V_b^c(f). \tag{8}$$

On the other hand, given any $\varepsilon > 0$, there are partitions of the intervals 
[a, b] and [b, c], respectively, such that 

$$\sum_i |f(x_i') - f(x_{i-1}')| > V_a^b(f) - \frac{\varepsilon}{2},$$

$$\sum_j |f(x_j'') - f(x_{j-1}'')| > V_b^c(f) - \frac{\varepsilon}{2}.$$

Combining all points of subdivision $x_i'$, $x_j''$, we get a partition of the 
interval [a, c], with points of subdivision $x_k$, such that 

$$\sum_k |f(x_k) - f(x_{k-1})| = \sum_i |f(x_i') - f(x_{i-1}')| + \sum_j |f(x_j'') - f(x_{j-1}'')| > V_a^b(f) + V_b^c(f) - \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, it follows that 

$$V_a^c(f) \ge V_a^b(f) + V_b^c(f). \tag{9}$$

Comparing (8) and (9), we get (6). ∎ 


COROLLARY. The function 

$$v(x) = V_a^x(f) \tag{10}$$

is nondecreasing. 


Proof. An immediate consequence of (6), since the total variation of 
any function of bounded variation on any interval is nonnegative. ∎ 


THEOREM 3. Let f be a function of bounded variation on [a, b], and let 
v be the function (10). Then if f is continuous from the left at a point x*, 
so is v. 


Proof. Given any $\varepsilon > 0$, use the fact that f is continuous from the left 
to choose a $\delta > 0$ such that 

$$|f(x^*) - f(x)| < \frac{\varepsilon}{2} \tag{11}$$

whenever $0 < x^* - x < \delta$. Then choose a partition 

$$a = x_0 < x_1 < \cdots < x_n = x^*$$

such that 

$$V_a^{x^*}(f) - \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})| < \frac{\varepsilon}{2}. \tag{12}$$

Here it can be assumed that 

$$x^* - x_{n-1} < \delta,$$

since otherwise we need only add an extra point of subdivision which can 
never increase the left-hand side of (12). It follows from (11) and (12) 
that 

$$V_a^{x^*}(f) - \sum_{k=1}^{n-1} |f(x_k) - f(x_{k-1})| < \varepsilon,$$

and hence 

$$V_a^{x^*}(f) - V_a^{x_{n-1}}(f) < \varepsilon$$

a fortiori, i.e., 

$$v(x^*) - v(x_{n-1}) < \varepsilon.$$

But then, since v is nondecreasing, 

$$v(x^*) - v(x) < \varepsilon$$

for all x such that $x_{n-1} \le x \le x^*$. In other words, v is continuous from 
the left at x*. ∎ 


Remark. Virtually the same argument shows that if f is continuous from 
the right at x*, then so is v. Together with Theorem 3, this shows that if 
f is continuous at x*, or on the whole interval [a, b], then so is v. 


THEOREM 4. If f is of bounded variation on [a, b], then f can be represented 
as the difference between two nondecreasing functions on [a, b]. 

Proof. Let 

$$v(x) = V_a^x(f),$$

and consider the function 

$$g = v - f.$$

Then g is nondecreasing. In fact, if x′ < x″, then 

$$g(x'') - g(x') = [v(x'') - v(x')] - [f(x'') - f(x')]. \tag{13}$$

But 

$$|f(x'') - f(x')| \le v(x'') - v(x'),$$

by the very definition of v, and hence the right-hand side of (13) is 
nonnegative. Writing 

$$f = v - g,$$

we get the desired representation of f as the difference between two 
nondecreasing functions. ∎ 
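
The construction in the proof of Theorem 4 can be imitated on a grid. In the sketch below, which is an illustration rather than a proof, the running variation sum of the samples is used as a stand-in for $v(x) = V_a^x(f)$; this agrees with the true variation here only because the hypothetical sample function is monotonic between consecutive turning points that lie on the grid. Exact rational arithmetic avoids rounding issues in the monotonicity checks. 

```python
from fractions import Fraction


def jordan_decomposition(samples):
    """Given samples f(x_0), ..., f(x_n), return lists (v, g) with f = v - g.

    v[k] is the variation sum of the samples up to index k (a grid stand-in
    for v(x) = V_a^x(f)), and g = v - f as in the proof of Theorem 4.
    """
    v = [Fraction(0)]
    for prev, cur in zip(samples, samples[1:]):
        v.append(v[-1] + abs(cur - prev))
    g = [vk - fk for vk, fk in zip(v, samples)]
    return v, g


def is_nondecreasing(seq):
    return all(x <= y for x, y in zip(seq, seq[1:]))


# Hypothetical example: f(x) = x^3 - 3x on [-2, 2], sampled with step 1/10.
# f increases on [-2, -1], decreases on [-1, 1] and increases on [1, 2].
xs = [Fraction(k, 10) for k in range(-20, 21)]
samples = [x ** 3 - 3 * x for x in xs]
v, g = jordan_decomposition(samples)
```

Both `v` and `g` come out nondecreasing although the sampled f is not, and the last entry of `v` equals 12, the sum of the three monotone increments 4 + 4 + 4. 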


COROLLARY 1. Every function of bounded variation has a finite derivative 
almost everywhere. 


Proof. An immediate consequence of Theorem 6, p. 321. ∎ 

COROLLARY 2. If f is summable on [a, b], then the indefinite integral 

$$F(x) = \int_a^x f(t)\,dt$$

is a function of bounded variation on [a, b]. 

Proof. Recall the remarks at the beginning of Sec. 31.1. ∎ 
Problem 1. Prove that $V_a^b(f) = 0$ if and only if f(x) ≡ const on [a, b]. 

Problem 2. Prove that the function 

$$f(x) = \begin{cases} x^{\alpha} \sin \dfrac{1}{x^{\beta}} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0 \end{cases}$$

is of bounded variation on [0, 1] if α > β but not if α ≤ β. 

Problem 3. Suppose f has a bounded derivative on [a, b], so that f′(x) 
exists and satisfies an inequality |f′(x)| ≤ C at every point $x \in [a, b]$. Prove 
that f is of bounded variation and 

$$V_a^b(f) \le C(b - a).$$


Problem 4. Prove that if f and g are functions of bounded variation on 
[a, b], then so is fg and 

$$V_a^b(fg) \le V_a^b(f) \sup_{a \le x \le b} |g(x)| + V_a^b(g) \sup_{a \le x \le b} |f(x)|.$$

Problem 5. Let f be a function of bounded variation on [a, b] such that 

$$f(x) \ge c > 0.$$

Prove that 1/f is also a function of bounded variation and 

$$V_a^b\!\left(\frac{1}{f}\right) \le \frac{1}{c^2}\, V_a^b(f).$$
Problem 6. Prove the converse of Theorem 4. 
Problem 7. Prove that a curve 

$$y = f(x) \qquad (a \le x \le b)$$

is rectifiable, i.e., has finite length, as defined in Problem 3, p. 114, if and 
only if f is of bounded variation on [a, b]. 


Problem 8. Let f be a function of bounded variation on [a, b]. Prove that 

$$\|f\| = V_a^b(f)$$

has all the properties of a norm (cf. p. 138) if we impose the extra condition 
f(a) = 0. 


Comment. Thus the space $V^0_{[a,b]}$ of all functions of bounded variation 
on [a, b] equipped with this norm and vanishing at x = a is a normed 
linear space (addition of functions and multiplication of functions by 
numbers being defined in the usual way). 


Problem 9. Prove that the space $V^0_{[a,b]}$ defined in the preceding comment 
is complete. 


Problem 10. Does there exist a continuous function which is not of 
bounded variation on any interval? 


Hint. Recall Problem 10, p. 327 and Corollary 1 above. 


33. Reconstruction of a Function from Its Derivative 


33.1. Statement of the problem. We now address ourselves to the second 
of the problems posed on p. 314, i.e., we look for the largest class of functions 
F such that 


$$\int_a^x F'(t)\,dt = F(x) - F(a), \tag{1}$$

or equivalently 

$$F(x) = F(a) + \int_a^x F'(t)\,dt. \tag{2}$$


(As we know from calculus, these formulas hold if F is continuously differ- 
entiable.) From the outset, we must restrict ourselves to functions F which 
are differentiable (i.e., have a finite derivative) almost everywhere, since 
otherwise (2) would be meaningless. Every function of bounded variation 
has this property (see Corollary 1, p. 331). Moreover, the right-hand side of 
(2) is a function of bounded variation (see Corollary 2, p. 331). It follows 
that the largest class of functions satisfying (2) must be some subset of the 
class of functions of bounded variation. Since every function of bounded 
variation is the difference between two nondecreasing functions (see Theorem 
4, p. 331), we begin by studying nondecreasing functions from the standpoint 
of formula (1). 


THEOREM 1. Let F be a nondecreasing function on [a,b]. Then the 


derivative F’ is summable on [a, b] and 


Pr dt < F(b) — F(a). (3) 
Proof. Let 


®,(1) = nl F(e Ee ;) = ro! Get 5 


where, to make ®,,(t) meaningful for all ¢ € [a, b], we get F(t) = F(b) 
forb<t<b +1, by definition.® Clearly 


= 


iV iz= 
F'(t) =lim (ae 


n> 00 


=lim®,(1) 


Sle 


almost everywhere on [a, 5]. Since F is summable on [a, b], by Theorem 





⁸ Verify that this does not affect the validity of the proof.
1, p. 316, so is every $\Phi_n$. Integrating $\Phi_n$, we get

$$\begin{aligned}
\int_a^b \Phi_n(t)\,dt &= n\int_a^b \left[F\left(t + \frac{1}{n}\right) - F(t)\right] dt
 = n\left[\int_{a+(1/n)}^{b+(1/n)} F(t)\,dt - \int_a^b F(t)\,dt\right] \\
&= n\left[\int_b^{b+(1/n)} F(t)\,dt - \int_a^{a+(1/n)} F(t)\,dt\right] \le F(b) - F(a),
\end{aligned}$$

where in the last step we use the fact that F is nondecreasing. The
summability of $F'$ and the inequality (3) now follow at once from Fatou's
theorem (Theorem 3, p. 307).


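Example 1. It is easy to find nondecreasing functions F for which (3)
becomes a strict inequality, i.e., such that

$$\int_a^b F'(t)\,dt < F(b) - F(a). \tag{4}$$

For example, let

$$F(t) = \begin{cases} 0 & \text{if } 0 \le t \le \tfrac12, \\ 1 & \text{if } \tfrac12 < t \le 1. \end{cases}$$

Then

$$0 = \int_0^1 F'(t)\,dt < F(1) - F(0) = 1.$$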


Example 2 (The Cantor function). In the preceding example, F is discontinuous.
However, it is also possible to find continuous nondecreasing functions
satisfying the strict inequality (4). To this end, let

$$[a_1^{(1)}, b_1^{(1)}] = \left[\tfrac13, \tfrac23\right]$$

be the middle third of the interval [0, 1], let

$$[a_1^{(2)}, b_1^{(2)}] = \left[\tfrac19, \tfrac29\right], \qquad [a_2^{(2)}, b_2^{(2)}] = \left[\tfrac79, \tfrac89\right]$$

be the middle thirds of the intervals remaining after deleting $[a_1^{(1)}, b_1^{(1)}]$ from
[0, 1], let

$$[a_1^{(3)}, b_1^{(3)}] = \left[\tfrac{1}{27}, \tfrac{2}{27}\right], \qquad [a_2^{(3)}, b_2^{(3)}] = \left[\tfrac{7}{27}, \tfrac{8}{27}\right],$$

$$[a_3^{(3)}, b_3^{(3)}] = \left[\tfrac{19}{27}, \tfrac{20}{27}\right], \qquad [a_4^{(3)}, b_4^{(3)}] = \left[\tfrac{25}{27}, \tfrac{26}{27}\right]$$

be the middle thirds of the intervals remaining after deleting $[a_1^{(1)}, b_1^{(1)}]$,
$[a_1^{(2)}, b_1^{(2)}]$ and $[a_2^{(2)}, b_2^{(2)}]$ from [0, 1], and so on, with

$$[a_1^{(n)}, b_1^{(n)}],\ [a_2^{(n)}, b_2^{(n)}],\ \ldots,\ [a_{2^{n-1}}^{(n)}, b_{2^{n-1}}^{(n)}]$$

being the $2^{n-1}$ intervals deleted at the nth stage. Note that the complement of
the union of all the intervals $[a_k^{(n)}, b_k^{(n)}]$ is the set of all "points of the second
kind" of the Cantor set constructed in Example 4, p. 52, i.e., all points of the
Cantor set except the end points

$$0,\ 1,\ \tfrac13,\ \tfrac23,\ \tfrac19,\ \tfrac29,\ \tfrac79,\ \tfrac89,\ \ldots \tag{5}$$

of the deleted intervals (together with the points 0 and 1).
Now define a function

$$F(t) = \frac{2k - 1}{2^n} \quad \text{if } t \in [a_k^{(n)}, b_k^{(n)}],$$

so that

$$F(t) = \tfrac12 \quad \text{if } \tfrac13 \le t \le \tfrac23,$$

$$F(t) = \begin{cases} \tfrac14 & \text{if } \tfrac19 \le t \le \tfrac29, \\ \tfrac34 & \text{if } \tfrac79 \le t \le \tfrac89, \end{cases}$$

$$F(t) = \begin{cases} \tfrac18 & \text{if } \tfrac{1}{27} \le t \le \tfrac{2}{27}, \\ \tfrac38 & \text{if } \tfrac{7}{27} \le t \le \tfrac{8}{27}, \\ \tfrac58 & \text{if } \tfrac{19}{27} \le t \le \tfrac{20}{27}, \\ \tfrac78 & \text{if } \tfrac{25}{27} \le t \le \tfrac{26}{27}, \end{cases}$$

and so on, as shown schematically in Figure 19. Then F is defined everywhere
on [0, 1] except at points of the second kind of the Cantor set. Given any
such point $t^*$, let $\{t_n\}$ be an increasing sequence of points of the type (5)
converging to $t^*$, and let $\{t_n'\}$ be a decreasing sequence of points of the same
type converging to $t^*$ (why do such sequences exist?). Then let

$$F(t^*) = \lim_{n\to\infty} F(t_n) = \lim_{n\to\infty} F(t_n')$$

(justify the equality of the limits). Completing the definition of F in this way,
we obtain a continuous nondecreasing function on the whole interval [0, 1],
known as the Cantor function. (Fill in some missing details.) The derivative
$F'$ obviously vanishes at every interior point of the intervals $[a_k^{(n)}, b_k^{(n)}]$, and
hence vanishes almost everywhere, since the sum of the lengths of these
intervals equals

$$\frac13 + \frac29 + \frac{4}{27} + \cdots = 1$$

(the Cantor set is of measure zero). It follows that

$$0 = \int_0^1 F'(t)\,dt < F(1) - F(0) = 1.$$

FIGURE 19
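The construction of the Cantor function can be checked numerically. The following is only a sketch under an assumption that is not part of the text above: the values of F are read off from the ternary digits of t (a digit 1 means that t has fallen into a deleted middle third, where F is constant). The function name `cantor` and the parameter `depth` are illustrative choices.

```python
from fractions import Fraction

def cantor(t, depth=60):
    """Evaluate the Cantor function at t in [0, 1] from the ternary digits of t.

    The result is exact when t lies in a deleted interval of stage <= depth or
    has a terminating ternary expansion of at most `depth` digits; otherwise
    it is a truncation with error below 2**-depth.
    """
    t = Fraction(t)
    if t >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        t *= 3
        digit = int(t)              # next ternary digit of t (0, 1 or 2)
        t -= digit
        if digit == 1:              # t has entered a deleted middle third
            return value + scale    # F is constant on that interval
        value += scale * (digit // 2)
        scale /= 2
    return value

# Values on the deleted intervals of the first three stages.
for t in (Fraction(1, 2), Fraction(1, 9), Fraction(8, 9), Fraction(25, 27)):
    print(t, cantor(t))

# The total increase of F over [0, 1], although F' = 0 almost everywhere.
print(cantor(1) - cantor(0))
```

Exact rational arithmetic is used so that the end points of the deleted intervals are represented without rounding.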


33.2. Absolutely continuous functions. We have just given examples of 
functions for which formula (1) does not hold. To describe the class of 
functions satisfying (1), or equivalently (2), we will need the following 


DEFINITION. A function f defined on an interval [a, b] is said to be
absolutely continuous on [a, b] if, given any $\varepsilon > 0$, there is a $\delta > 0$ such that

$$\sum_{k=1}^n |f(b_k) - f(a_k)| < \varepsilon$$

for every finite system of pairwise disjoint subintervals

$$(a_k, b_k) \subset [a, b] \qquad (k = 1, \ldots, n)$$

of total length

$$\sum_{k=1}^n (b_k - a_k)$$

less than $\delta$.


Remark 1. Clearly every absolutely continuous function is uniformly
continuous, as we see by choosing a single subinterval $(a_1, b_1) \subset [a, b]$.
However, a uniformly continuous function need not be absolutely continuous.
For example, the Cantor function F constructed in Example 2 of the preceding
section is continuous (and hence uniformly continuous) on [0, 1], but not
absolutely continuous on [0, 1]. In fact, the Cantor set can be covered by a
finite system of subintervals $(a_k, b_k)$ of arbitrarily small total length (why?).
But obviously

$$\sum_k [F(b_k) - F(a_k)] = 1$$

for every such system. The same example shows that a function of bounded
variation need not be absolutely continuous. On the other hand, an absolutely
continuous function is necessarily of bounded variation (see Theorem 2).
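The failure of absolute continuity for the Cantor function can be illustrated as follows. This is a sketch, not part of the original argument: it assumes the ternary-digit evaluation of F (helper `cantor`) and uses the $2^n$ closed intervals left after n deletion stages as the covering system (helper `remaining_intervals`, a name chosen here). Their total length is $(2/3)^n$, yet the increments of F over them always add up to 1.

```python
from fractions import Fraction

def cantor(t, depth=60):
    """Cantor function on [0, 1], evaluated from the ternary digits of t."""
    t = Fraction(t)
    if t >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        t *= 3
        digit = int(t)
        t -= digit
        if digit == 1:
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value

def remaining_intervals(n):
    """The 2**n closed intervals left after n stages of deleting middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals

for n in (1, 5, 10):
    system = remaining_intervals(n)
    total_length = sum(b - a for a, b in system)
    total_increment = sum(cantor(b) - cantor(a) for a, b in system)
    print(n, float(total_length), total_increment)
```

No choice of $\delta$ can therefore work for, say, $\varepsilon = 1/2$, since the total length can be made smaller than any $\delta$ while the sum of increments stays equal to 1.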


Remark 2. In the definition, we can change "finite" to "finite or countable."
In fact, suppose that given any $\varepsilon > 0$, there is a $\delta > 0$ such that

$$\sum_{k=1}^n |f(b_k) - f(a_k)| < \varepsilon' < \varepsilon$$

for every finite system of pairwise disjoint intervals $(a_k, b_k) \subset [a, b]$ of total
length less than $\delta$, and consider any countable system of pairwise disjoint
intervals $(\alpha_k, \beta_k) \subset [a, b]$ of total length less than $\delta$. Then obviously

$$\sum_{k=1}^n |f(\beta_k) - f(\alpha_k)| < \varepsilon'$$

for every n. Hence, taking the limit as $n \to \infty$, we get

$$\sum_{k=1}^\infty |f(\beta_k) - f(\alpha_k)| \le \varepsilon' < \varepsilon.$$


THEOREM 2. If f is absolutely continuous on [a, b], then f is of bounded 
variation on [a, b]. 


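Proof. Given any $\varepsilon > 0$, there is a $\delta > 0$ such that

$$\sum_{k=1}^n |f(b_k) - f(a_k)| < \varepsilon$$

for every system of pairwise disjoint intervals $(a_k, b_k) \subset [a, b]$ such that

$$\sum_{k=1}^n (b_k - a_k) < \delta.$$

Hence if $[\alpha, \beta]$ is any interval of length less than $\delta$, we have

$$V_\alpha^\beta(f) \le \varepsilon.$$

Let

$$a = x_0 < x_1 < \cdots < x_N = b$$

be a partition of [a, b] into N subintervals $[x_{k-1}, x_k]$ all of length less
than $\delta$. Then, by Theorem 2, p. 329,

$$V_a^b(f) \le N\varepsilon < \infty. \qquad ∎$$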


THEOREM 3. If f is absolutely continuous on [a, b], then so is $\alpha f$, where
$\alpha$ is any constant. Moreover, if f and g are absolutely continuous on [a, b],
then so is f + g.


Proof. An immediate consequence of the definition of absolute continuity
and obvious properties of the absolute value. ∎


It follows from Theorems 2 and 3 (together with Remark 1) that the set 
of all absolutely continuous functions on [a, b] is a proper subspace of the 
linear space of all functions of bounded variation on [a, b]. 


THEOREM 4. If f is absolutely continuous on [a, b], then f can be represented
as the difference between two absolutely continuous nondecreasing
functions on [a, b].

Proof. By Theorem 2, f is of bounded variation on [a, b], and hence
can be represented in the form

$$f = v - g,$$

where

$$v(x) = V_a^x(f), \qquad g = v - f$$

are the same nondecreasing functions as in Theorem 4, p. 331. We now
verify that v and g are absolutely continuous. Given any $\varepsilon > 0$, let $\delta > 0$
be such that

$$\sum_{k=1}^n |f(b_k) - f(a_k)| < \varepsilon' < \varepsilon$$

for every finite system of pairwise disjoint subintervals $(a_k, b_k) \subset [a, b]$
of total length less than $\delta$. Consider the sum

$$\sum_{k=1}^n |v(b_k) - v(a_k)| = \sum_{k=1}^n [v(b_k) - v(a_k)],$$

equal to the least upper bound of the sums

$$\sum_{k=1}^n \sum_{l=1}^{m_k} |f(x_{k,l}) - f(x_{k,l-1})| \tag{6}$$

taken over all possible finite partitions

$$a_1 = x_{1,0} < x_{1,1} < \cdots < x_{1,m_1} = b_1,$$
$$\cdots$$
$$a_n = x_{n,0} < x_{n,1} < \cdots < x_{n,m_n} = b_n$$

of the intervals $(a_1, b_1), \ldots, (a_n, b_n)$. The total length of all the intervals
$(x_{k,l-1}, x_{k,l})$ figuring in (6) is clearly less than $\delta$, and hence the sum (6) is
less than $\varepsilon'$, by the absolute continuity of f. Therefore

$$\sum_{k=1}^n |v(b_k) - v(a_k)| \le \varepsilon' < \varepsilon,$$

i.e., v is absolutely continuous on [a, b]. It follows from Theorem 3
that g = v − f is also absolutely continuous on [a, b]. ∎


We now study the close connection between absolute continuity and the 
indefinite Lebesgue integral: 


THEOREM 5. The indefinite integral

$$F(x) = \int_a^x f(t)\,dt$$

of a summable function f is absolutely continuous.

Proof. Given any finite collection of pairwise disjoint intervals
$(a_k, b_k)$, we have

$$\sum_{k=1}^n |F(b_k) - F(a_k)| = \sum_{k=1}^n \left|\int_{a_k}^{b_k} f(t)\,dt\right| \le \sum_{k=1}^n \int_{a_k}^{b_k} |f(t)|\,dt = \int_{\bigcup_k (a_k, b_k)} |f(t)|\,dt.$$

But the last expression on the right approaches zero as the total length
of the intervals $(a_k, b_k)$ approaches zero, by the absolute continuity of
the Lebesgue integral (Theorem 6, p. 300). ∎


LEMMA. Let f be an absolutely continuous nondecreasing function on
[a, b] such that $f'(x) = 0$ almost everywhere. Then f(x) = const.


Proof. Since f is continuous and nondecreasing, its range is the closed
interval [f(a), f(b)]. We will show that the length of this interval is zero
if $f'(x) = 0$ almost everywhere, thereby proving the lemma. Let E be
the set of points $x \in [a, b]$ such that $f'(x) = 0$, and let Z = [a, b] − E,
where $\mu(Z) = 0$, by hypothesis. Given any $\varepsilon > 0$, we find $\delta > 0$ such
that

$$\sum_k |f(b_k) - f(a_k)| < \varepsilon \tag{7}$$

for any finite or countable system of pairwise disjoint intervals $(a_k, b_k) \subset
[a, b]$ of total length less than $\delta$ (recall Remark 2, p. 336), and then cover Z
by an open set of measure less than $\delta$ (this is possible, since Z is of measure
zero). In other words, we cover Z by a finite or countable system of
intervals $(a_k, b_k)$ of total length less than $\delta$. It then follows from (7) that
the whole system of intervals, and hence (a fortiori) the set

$$Z \subset \bigcup_k (a_k, b_k),$$

is mapped into a set of measure less than $\varepsilon$. But then $\mu[f(Z)] = 0$,
since $\varepsilon > 0$ is arbitrary.
Next consider the set E = [a, b] − Z, and let $x_0 \in E$. Then, since
$f'(x_0) = 0$, we have

$$\frac{f(x) - f(x_0)}{x - x_0} < \varepsilon$$

for all $x > x_0$ sufficiently near $x_0$, i.e.,

$$f(x) - f(x_0) < \varepsilon(x - x_0)$$

or

$$\varepsilon x_0 - f(x_0) < \varepsilon x - f(x).$$

Therefore the point $x_0$ is invisible from the right with respect to the
function $\varepsilon x - f(x)$. It follows from Lemma 1, p. 319 that E is the
union of no more than countably many pairwise disjoint intervals $(\alpha_k, \beta_k)$,
with end points satisfying the inequalities

$$\varepsilon\alpha_k - f(\alpha_k) \le \varepsilon\beta_k - f(\beta_k)$$

or

$$f(\beta_k) - f(\alpha_k) \le \varepsilon(\beta_k - \alpha_k).$$

But then

$$\sum_k [f(\beta_k) - f(\alpha_k)] \le \varepsilon \sum_k (\beta_k - \alpha_k) \le \varepsilon(b - a).$$

In other words, f maps E into a set covered by a system of intervals of
total length no greater than $\varepsilon(b - a)$. Therefore $\mu[f(E)] = 0$, since $\varepsilon > 0$
is arbitrary.

We have just shown that the sets f(Z) and f(E) are both of measure
zero. But the interval [f(a), f(b)] is the union of f(Z) and f(E). It
follows that [f(a), f(b)] is of length zero, i.e., that f(x) = const. ∎

We are now in a position to prove 


THEOREM 6 (Lebesgue). If F is absolutely continuous on [a, b], then
the derivative $F'$ is summable on [a, b] and

$$F(x) = F(a) + \int_a^x F'(t)\,dt. \tag{8}$$

Proof. We need only consider the case of nondecreasing F (why?).
Then $F'$ is summable, by Theorem 1, and the function

$$\Phi(x) = F(x) - \int_a^x F'(t)\,dt \tag{9}$$

is also nondecreasing. In fact, if $x'' > x'$, then

$$\Phi(x'') - \Phi(x') = F(x'') - F(x') - \int_{x'}^{x''} F'(t)\,dt \ge 0,$$

where we again use Theorem 1. Moreover, $\Phi$ is absolutely continuous,
being the difference between two absolutely continuous functions (recall
Theorems 3 and 5), and $\Phi'(x) = 0$ almost everywhere, by Theorem 8,
p. 324. It follows from the lemma that $\Phi(x)$ = const. Setting x = a,
we find that this constant equals F(a). Replacing $\Phi(x)$ by F(a) in (9),
we get (8). ∎

Remark. Combining Theorems 5 and 6, we can now give a definitive
answer to the second of the questions posed on p. 314 (see also p. 333):
The formula

$$\int_a^x F'(t)\,dt = F(x) - F(a),$$

or equivalently,

$$F(x) = F(a) + \int_a^x F'(t)\,dt,$$

holds for all $x \in [a, b]$ if and only if F is absolutely continuous on [a, b].

33.3. The Lebesgue decomposition. Let f be a function of bounded variation
on [a, b]. Then it follows from Theorem 4, p. 331 and Problem 9, p. 327
that f can (in general) be represented as a sum

$$f(x) = \varphi(x) + \psi(x), \tag{10}$$

where $\varphi$ is a continuous function of bounded variation and $\psi$ is a jump
function.⁹ Now let

$$\varphi_1(x) = \int_a^x \varphi'(t)\,dt, \tag{11}$$

$$\varphi_2(x) = \varphi(x) - \varphi_1(x).$$

Then $\varphi_1$ is absolutely continuous, while $\varphi_2$ is a continuous function of bounded
variation such that

$$\varphi_2'(x) = \varphi'(x) - \frac{d}{dx}\int_a^x \varphi'(t)\,dt = 0$$

almost everywhere. A continuous function of bounded variation is said to
be singular if its derivative vanishes almost everywhere. For example, the
Cantor function F constructed in Example 2, p. 334 is singular. Combining
(10) and (11), we find that a function f of bounded variation can (in general)
be represented as a sum

$$f(x) = \varphi_1(x) + \varphi_2(x) + \psi(x) \tag{12}$$

of an absolutely continuous function $\varphi_1$, a singular function $\varphi_2$ and a jump
function $\psi$. Formula (12) is known as the Lebesgue decomposition.


Remark. Differentiating (12), we get

$$f'(x) = \varphi_1'(x)$$

almost everywhere. Thus integration of the derivative of a function of
bounded variation does not restore the function itself, but only its absolutely
continuous "component," while the other two components, i.e., the singular
function and the jump function, "disappear without a trace."


Problem 1. Prove that a function f is absolutely continuous on [a, b] if
and only if it is a continuous function of bounded variation mapping every
subset $Z \subset [a, b]$ of measure zero into a set of measure zero.


⁹ Generalizing Problem 9, p. 327, by a jump function, we now mean a function of
the form

$$\sum_{x_n < x} h_n + \sum_{x_n' \le x} h_n',$$

where the numbers $h_1, \ldots, h_n, \ldots$ and $h_1', \ldots, h_n', \ldots$ corresponding to the
discontinuity points $x_1, \ldots, x_n, \ldots$ and $x_1', \ldots, x_n', \ldots$ satisfy the conditions

$$\sum_n |h_n| < \infty, \qquad \sum_n |h_n'| < \infty$$

(we now allow negative $h_n$, $h_n'$).

Problem 2. Verify directly from the definition on p. 336 that the function

$$f(x) = \begin{cases} x \sin \dfrac{1}{x} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0 \end{cases}$$

fails to be absolutely continuous on any interval [a, b] containing the point
x = 0.


Problem 3. Prove that if a function f satisfies a Lipschitz condition

$$|f(x') - f(x'')| \le K\,|x' - x''|$$

for all $x', x'' \in [a, b]$, then f is absolutely continuous on [a, b].


Problem 4. Prove that each of the terms $\varphi_1$, $\varphi_2$ and $\psi$ in the Lebesgue
decomposition (12) is unique to within an additive constant.


Comment. The stipulation "to within an additive constant" can be
dropped if we require the function f and its "components" to vanish at x = a,
say, or if we agree to regard all functions differing by a constant as equivalent.


Problem 5. Let $A^0_{[a,b]}$ be the space of all absolutely continuous functions
f defined on [a, b], satisfying the condition f(a) = 0. Prove that $A^0_{[a,b]}$ is
a closed subspace of the space $V^0_{[a,b]}$ of all functions of bounded variation
on [a, b] satisfying the same condition, equipped with the norm $\|f\| = V_a^b(f)$.


Comment. There is no need for the condition f(a) = 0 if we regard all 
functions differing by a constant as equivalent. We then have || f || = 0 if 
and only if f = const. 


Problem 6. Starting from a locally summable function f, i.e., a function
summable on every finite interval, define the corresponding generalized
function f and generalized derivative $f'$ by the formulas

$$(f, \varphi) = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx,$$

$$(f', \varphi) = -\int_{-\infty}^{\infty} f(x)\varphi'(x)\,dx$$

as in Sec. 21.2. (Here $\varphi$ is any test function, i.e., any infinitely differentiable
function of finite support.) Prove that the generalized derivative $f'$ determines
f to within an additive constant. Apply this to the case of the function

$$f(x) = \begin{cases} 0 & \text{if } x < 0, \\ F(x) & \text{if } 0 \le x \le 1, \\ 1 & \text{if } x > 1, \end{cases}$$

where F is the Cantor function constructed in Example 2, p. 334.

Hint. See Theorem 1, p. 213.


Problem 7. Let f and $f'$ be the same as in the preceding problem, and
suppose f is of bounded variation on $(-\infty, \infty)$. Then f has an ordinary
derivative almost everywhere. Let $f_1$ be the generalized function corresponding
to df/dx, so that

$$(f_1, \varphi) = \int_{-\infty}^{\infty} \frac{df}{dx}\,\varphi(x)\,dx.$$

Prove that

a) In general, $f_1$ does not equal the generalized derivative $f'$;

b) If f is absolutely continuous, then $f_1 = f'$;

c) If $f_1 = f'$, then f is equivalent to an absolutely continuous function¹⁰
and, in particular, is absolutely continuous if it is continuous.


Hint. In a), consider the function

$$f(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x \ge 0. \end{cases}$$

Comment. Problems 6 and 7 further illustrate the situation discussed 
on pp. 206-207. To carry out the operations of analysis (in this case, recon- 
struction of a function from its derivative), we can either restrict the class of 
admissible functions (by requiring them to be absolutely continuous) or else 
extend the notion of function itself (at the same time, extending the notion 
of a derivative). 


34. The Lebesgue Integral as a Set Function 


34.1. Charges. The Hahn and Jordan decompositions. As we now show,
the theory developed in Secs. 31-33 for functions defined on the real line
$(-\infty, \infty)$ continues to make sense in a much more general setting. Let X
be a space (i.e., some "master set") equipped with a measure $\mu$, and let f
be a $\mu$-summable function defined on X. Then f is summable on every
measurable subset $E \subset X$, so that the integral

$$\Phi(E) = \int_E f\,d\mu \tag{1}$$

(for fixed f) defines a set function on the system $\mathcal{S}_\mu$ of all $\mu$-measurable
subsets of X. By Theorem 4, p. 298, $\Phi$ is $\sigma$-additive, i.e., if a measurable
set E is a finite or countable union

$$E = \bigcup_n E_n$$





¹⁰ I.e., coincides almost everywhere with an absolutely continuous function.
of pairwise disjoint measurable sets $E_n$, then

$$\Phi(E) = \sum_n \Phi(E_n).$$

In other words, the set function (1) has all the properties of a $\sigma$-additive
measure except that it may not be nonnegative in the case where f takes
negative values. These considerations suggest


DEFINITION 1. A $\sigma$-additive set function $\Phi$ defined on a $\sigma$-ring (in
particular, a $\sigma$-algebra) of subsets of a space X and in general taking
values of both signs is called a signed measure or charge (on X).


Remark. Thus the notion of a measure is equivalent to that of a non- 
negative charge. 


In the case of electrical charge distributed on a surface, we can divide 
the surface into two regions, one carrying positive charge (i.e., such that 
every part of the region is positively charged) and one carrying negative 
charge. We will establish the mathematical equivalent of this fact in a 
moment, after first introducing 


DEFINITION 2. Let $\Phi$ be a charge defined on a $\sigma$-algebra $\mathcal{S}$ of subsets
of a space X. Then a set $A \subset X$ is said to be negative with respect to $\Phi$
if $E \cap A \in \mathcal{S}$ and $\Phi(E \cap A) \le 0$ for every $E \in \mathcal{S}$. Similarly, A is said
to be positive with respect to $\Phi$ if $E \cap A \in \mathcal{S}$ and $\Phi(E \cap A) \ge 0$ for
every $E \in \mathcal{S}$.


THEOREM 1. Given a charge $\Phi$ on a space X, there is a measurable set
$A^- \subset X$ such that $A^-$ is negative and $A^+ = X - A^-$ is positive with
respect to $\Phi$.


Proof. Let

$$a = \inf \Phi(A),$$

where the greatest lower bound is taken over all measurable negative
sets A. Let $\{A_n\}$ be a sequence of measurable negative sets such that

$$\lim_{n\to\infty} \Phi(A_n) = a.$$

Then

$$A^- = \bigcup_n A_n$$

is a measurable negative set such that

$$\Phi(A^-) = a$$

(why?). To show that $A^-$ is the required set, we must now prove that
$A^+ = X - A^-$ is positive. Suppose $A^+$ is not positive. Then $A^+$ contains
a measurable subset $B_0$ such that $\Phi(B_0) < 0$. However, $B_0$ cannot be
negative, since if it were, the set $\tilde{A} = A^- \cup B_0$ would be a negative set
such that $\Phi(\tilde{A}) < a$, which is impossible. Hence there is a least positive
integer $k_1$ such that $B_0$ contains a subset $B_1$ satisfying the condition

$$\Phi(B_1) \ge \frac{1}{k_1}.$$

Obviously $B_1 \ne B_0$. Applying the same argument to the set $B_0 - B_1$,
we find a least positive integer $k_2$ such that $B_0 - B_1$ contains a subset $B_2$
satisfying the inequality

$$\Phi(B_2) \ge \frac{1}{k_2} \qquad (k_2 \ge k_1)$$

(explain why $k_2 \ge k_1$), a least positive integer $k_3$ such that $(B_0 - B_1) - B_2$
contains a subset $B_3$ satisfying the inequality

$$\Phi(B_3) \ge \frac{1}{k_3} \qquad (k_3 \ge k_2),$$

and so on. Now let

$$F = B_0 - \bigcup_{n=1}^\infty B_n.$$

Clearly F is nonempty, since $\Phi(B_0) < 0$ while $\Phi(B_n) > 0$ for all $n \ge 1$.
Moreover, F is negative by construction (think things through). Hence
the set $\tilde{A} = A^- \cup F$ is again negative and $\Phi(\tilde{A}) < a$, which is impossible.
This contradiction shows that $A^+ = X - A^-$ must be positive. ∎

Thus we can represent X as a union

$$X = A^+ \cup A^- \tag{2}$$

of two disjoint measurable sets $A^+$ and $A^-$, where $A^+$ is positive and $A^-$ is
negative with respect to the charge $\Phi$. The representation (2) is called the
Hahn decomposition of X, and may not be unique. However, if

$$X = A_1^+ \cup A_1^-, \qquad X = A_2^+ \cup A_2^-$$

are two distinct Hahn decompositions of X, then

$$\Phi(E \cap A_1^-) = \Phi(E \cap A_2^-), \qquad \Phi(E \cap A_1^+) = \Phi(E \cap A_2^+) \tag{3}$$

for every $E \in \mathcal{S}$. In fact,

$$E \cap (A_1^- - A_2^-) \subset E \cap A_1^- \tag{4}$$

and at the same time

$$E \cap (A_1^- - A_2^-) \subset E \cap A_2^+. \tag{5}$$

But (4) implies

$$\Phi(E \cap (A_1^- - A_2^-)) \le 0,$$

while (5) implies

$$\Phi(E \cap (A_1^- - A_2^-)) \ge 0.$$

Therefore

$$\Phi(E \cap (A_1^- - A_2^-)) = 0, \tag{6}$$

and similarly

$$\Phi(E \cap (A_2^- - A_1^-)) = 0. \tag{7}$$

It follows from (6) and (7) that

$$\begin{aligned}
\Phi(E \cap A_1^-) &= \Phi(E \cap (A_1^- - A_2^-)) + \Phi(E \cap (A_1^- \cap A_2^-)) \\
&= \Phi(E \cap (A_2^- - A_1^-)) + \Phi(E \cap (A_1^- \cap A_2^-)) = \Phi(E \cap A_2^-),
\end{aligned}$$

which proves the first of the formulas (3). The second formula is proved
in exactly the same way.

Thus a charge $\Phi$ on a space X uniquely determines two nonnegative set
functions, namely

$$\Phi^+(E) = \Phi(E \cap A^+), \qquad \Phi^-(E) = -\Phi(E \cap A^-),$$

called the positive variation and negative variation of $\Phi$, respectively. It is
clear that

1) $\Phi = \Phi^+ - \Phi^-$;
2) $\Phi^+$ and $\Phi^-$ are nonnegative $\sigma$-additive set functions, i.e., measures;
3) The set function $|\Phi| = \Phi^+ + \Phi^-$, called the total variation of $\Phi$, is
also a measure.

The representation

$$\Phi = \Phi^+ - \Phi^-$$

of a charge $\Phi$ as the difference between its positive and negative variations
is called the Jordan decomposition of $\Phi$.
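
For a charge on a finite space, the Hahn and Jordan decompositions can be written down explicitly. The sketch below is purely illustrative: the four-point space, the weights and all function names are hypothetical choices made here, with every subset taken to be measurable and the charge of a set defined as the sum of the weights of its points.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical charge on X = {"a", "b", "c", "d"}: Phi(E) = sum of point weights.
weights = {"a": Fraction(3), "b": Fraction(-2), "c": Fraction(0), "d": Fraction(-1, 2)}
X = frozenset(weights)

def phi(E):
    return sum((weights[x] for x in E), Fraction(0))

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# One Hahn decomposition: A_minus collects the points of negative weight.
A_minus = frozenset(x for x in X if weights[x] < 0)
A_plus = X - A_minus

def phi_plus(E):           # positive variation
    return phi(frozenset(E) & A_plus)

def phi_minus(E):          # negative variation
    return -phi(frozenset(E) & A_minus)

def total_variation(E):
    return phi_plus(E) + phi_minus(E)

# A second Hahn decomposition: the zero-weight point "c" may go to either side.
A_minus_alt = A_minus | {"c"}
A_plus_alt = X - A_minus_alt

print(phi(X), phi_plus(X), phi_minus(X), total_variation(X))
```

The point of weight zero shows why the Hahn decomposition need not be unique, while the values $\Phi(E \cap A^-)$ do not depend on which decomposition is used.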


34.2. Classification of charges. The Radon-Nikodym theorem. We now
classify charges on a space X equipped with a measure:


DEFINITION 3. Let $\mu$ be a $\sigma$-additive measure on a $\sigma$-algebra $\mathcal{S}_\mu$ of
($\mu$-measurable) subsets of a space X, and let $\Phi$ be a charge defined on $\mathcal{S}_\mu$.
Then $\Phi$ is said to be concentrated on a set $A \in \mathcal{S}_\mu$ if $\Phi(E) = 0$ for every
measurable set $E \subset X - A$.


DEFINITION 4. Let $\mu$, $\mathcal{S}_\mu$, X and $\Phi$ be the same as in Definition 3.
Then $\Phi$ is said to be

1) Continuous if $\Phi(E) = 0$ for every single-element set $E \subset X$ of
measure zero;

2) Singular if $\Phi$ is concentrated on a set of measure zero;

3) Discrete if $\Phi$ is concentrated on a finite or countable set of measure
zero;

4) Absolutely continuous (with respect to $\mu$) if $\Phi(E) = 0$ for every
measurable set E such that $\mu(E) = 0$.


Clearly, the Lebesgue integral

$$\Phi(E) = \int_E \varphi(x)\,d\mu$$

of a fixed summable function $\varphi$ is absolutely continuous with respect to the
measure $\mu$. As we will see in a moment, every absolutely continuous charge
can be represented in this form. But first we need the following


LEMMA. Let $\mu$ be a $\sigma$-additive measure defined on a $\sigma$-algebra $\mathcal{S}_\mu$ of
subsets of a space X, and let $\Phi$ be another such measure defined on $\mathcal{S}_\mu$.
Suppose $\Phi$ is absolutely continuous with respect to $\mu$ and is not identically
zero. Then there is a positive integer n and a set $A \in \mathcal{S}_\mu$ such that $\mu(A) > 0$
and A is positive with respect to the charge $\Phi - (1/n)\mu$.


Proof. Let

$$X = A_n^- \cup A_n^+ \qquad (n = 1, 2, \ldots)$$

be the Hahn decomposition corresponding to the charge $\Phi - (1/n)\mu$,
and let

$$A_0^- = \bigcap_{n=1}^\infty A_n^-, \qquad A_0^+ = \bigcup_{n=1}^\infty A_n^+.$$

Then

$$\Phi(A_0^-) \le \frac{1}{n}\,\mu(A_0^-)$$

for all n = 1, 2, ..., i.e., $\Phi(A_0^-) = 0$, and hence $\Phi(A_0^+) > 0$ since
$X = A_0^- \cup A_0^+$ and $\Phi$ is not identically zero. But then $\mu(A_0^+) > 0$, by
the absolute continuity of $\Phi$. Hence there is an n such that $\mu(A_n^+) > 0$
(why?). This n and the set $A = A_n^+$ satisfy the conditions of the lemma. ∎


THEOREM 2 (Radon-Nikodym). Let $\mu$ be a $\sigma$-additive measure defined
on a $\sigma$-algebra $\mathcal{S}_\mu$ of subsets of a space X, and let $\Phi$ be a charge defined on
$\mathcal{S}_\mu$. Suppose $\Phi$ is absolutely continuous with respect to $\mu$. Then there is a
$\mu$-summable function $\varphi$ on X such that

$$\Phi(E) = \int_E \varphi(x)\,d\mu \tag{8}$$

for every $E \in \mathcal{S}_\mu$. The function $\varphi$ is unique to within its values on a set
of $\mu$-measure zero.




Proof. We can assume that $\Phi$ is not identically zero, since otherwise
we need only choose $\varphi$ to be any function equal to zero almost everywhere
(discuss the uniqueness of $\varphi$ in this case). Let K be the set of all
$\mu$-summable functions f on X such that

$$\int_E f(x)\,d\mu \le \Phi(E)$$

for every $E \in \mathcal{S}_\mu$, and let

$$M = \sup_{f \in K} \int_X f(x)\,d\mu.$$

Moreover, let $\{f_n\}$ be a sequence of functions in K such that

$$\lim_{n\to\infty} \int_X f_n(x)\,d\mu = M, \tag{9}$$

and let

$$g_n(x) = \max\,\{f_1(x), \ldots, f_n(x)\}.$$

Then clearly

$$g_1(x) \le g_2(x) \le \cdots \le g_n(x) \le \cdots.$$

Moreover,

$$\int_E g_n(x)\,d\mu \le \Phi(E) \tag{10}$$

for every $E \in \mathcal{S}_\mu$. In fact, E can be written in the form

$$E = \bigcup_{k=1}^n E_k,$$

where the sets $E_1, \ldots, E_n$ are pairwise disjoint and $g_n(x) = f_k(x)$ on
$E_k$, and hence

$$\int_E g_n(x)\,d\mu = \sum_{k=1}^n \int_{E_k} f_k(x)\,d\mu \le \sum_{k=1}^n \Phi(E_k) = \Phi(E).$$

In particular, it follows from (10) that $g_n \in K$, so that

$$\int_X g_n(x)\,d\mu \le M.$$

But then

$$\lim_{n\to\infty} \int_X g_n(x)\,d\mu = M,$$

since otherwise

$$\lim_{n\to\infty} \int_X f_n(x)\,d\mu \le \lim_{n\to\infty} \int_X g_n(x)\,d\mu < M,$$

contrary to (9). Writing

$$\varphi(x) = \sup_n g_n(x),$$

we find that

$$\varphi(x) = \lim_{n\to\infty} g_n(x),$$

and hence, by Levi's theorem (Theorem 2, p. 305),

$$\int_X \varphi(x)\,d\mu = \lim_{n\to\infty} \int_X g_n(x)\,d\mu = M. \tag{11}$$


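Next we show that $\varphi$ is the required function, figuring in the
representation (8). By construction, the set function

$$\lambda(E) = \Phi(E) - \int_E \varphi(x)\,d\mu$$

is nonnegative and in fact is a $\sigma$-additive measure. If $\lambda(E) \not\equiv 0$, then,
by the lemma, there is an $\varepsilon > 0$ and a set $A \in \mathcal{S}_\mu$ such that $\mu(A) > 0$
and

$$\varepsilon\mu(E \cap A) \le \lambda(E \cap A)$$

for every $E \in \mathcal{S}_\mu$. Let

$$h(x) = \varphi(x) + \varepsilon\chi_A(x),$$

where¹¹

$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$

Then

$$\int_E h(x)\,d\mu = \int_E \varphi(x)\,d\mu + \varepsilon\mu(E \cap A)
\le \int_E \varphi(x)\,d\mu + \lambda(E \cap A) \le \Phi(E),$$

so that h belongs to the set K introduced at the beginning of the proof.
On the other hand, it follows from (11) that

$$\int_X h(x)\,d\mu = \int_X \varphi(x)\,d\mu + \varepsilon\mu(A) > M,$$

contrary to the definition of M. Therefore $\lambda(E) \equiv 0$, which is equivalent
to (8).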

Finally, to prove that $\varphi$ is unique to within its values on a set of
measure zero, suppose

$$\Phi(E) = \int_E \varphi(x)\,d\mu = \int_E \varphi^*(x)\,d\mu$$

for all $E \in \mathcal{S}_\mu$. Then, by Chebyshev's inequality (Theorem 5, p. 299),
we have

$$\mu(A_m) \le m \int_{A_m} [\varphi(x) - \varphi^*(x)]\,d\mu = 0$$





¹¹ $\chi_A$ is called the characteristic function of the set A.
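for every set

$$A_m = \left\{x : \varphi(x) - \varphi^*(x) > \frac{1}{m}\right\} \qquad (m = 1, 2, \ldots),$$

and similarly

$$\mu(B_n) = 0$$

for every set

$$B_n = \left\{x : \varphi^*(x) - \varphi(x) > \frac{1}{n}\right\} \qquad (n = 1, 2, \ldots).$$

But

$$\{x : \varphi(x) \ne \varphi^*(x)\} = \Big(\bigcup_m A_m\Big) \cup \Big(\bigcup_n B_n\Big),$$

and hence

$$\mu\{x : \varphi(x) \ne \varphi^*(x)\} = 0,$$

i.e., $\varphi(x) = \varphi^*(x)$ almost everywhere. ∎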


Remark 1. The function $\varphi$ figuring in the representation (8) is called the
Radon-Nikodym derivative (or simply the density) of the charge $\Phi$ with
respect to the measure $\mu$, and is denoted

$$\frac{d\Phi}{d\mu}.$$

Clearly, Theorem 2 is the natural generalization of Lebesgue's theorem
(Theorem 6, p. 340), which states that an absolutely continuous function
F is the integral of its own derivative $F'$. However, in the case of a function
F defined on the real line there is an explicit procedure for finding the
derivative of F at a point $x_0$, namely evaluation of the limit

$$\lim_{\Delta x \to 0} \frac{\Delta F}{\Delta x} = \lim_{\Delta x \to 0} \frac{F(x_0 + \Delta x) - F(x_0)}{\Delta x},$$

whereas the Radon-Nikodym theorem only establishes the existence of the
derivative $d\Phi/d\mu$, without telling how to find it. However, an explicit
procedure can be given for evaluating $d\Phi/d\mu$ at a point $x_0 \in X$ by calculating
the limit

$$\lim_{\varepsilon \to 0} \frac{\Phi(A_\varepsilon)}{\mu(A_\varepsilon)},$$

where $\{A_\varepsilon\}$ is a system of sets "converging to the point $x_0$" as $\varepsilon \to 0$, in a
suitably defined sense.¹²
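
On a finite space the Radon-Nikodym derivative can be computed point by point, which gives a simple check of the representation (8). The sketch below rests on assumptions made only for illustration: a hypothetical four-point space in which every subset is measurable, a measure and a charge given by point weights, and function names chosen here. The density is the ratio of the two weights wherever the measure of the point is positive, and is set (arbitrarily) to zero on points of measure zero.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point weights: mu is a measure, phi_w defines a charge.
mu_w = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(0), "d": Fraction(1, 4)}
phi_w = {"a": Fraction(1), "b": Fraction(-3, 4), "c": Fraction(0), "d": Fraction(0)}
X = frozenset(mu_w)

def subsets(S):
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def mu(E):
    return sum((mu_w[x] for x in E), Fraction(0))

def charge(E):
    return sum((phi_w[x] for x in E), Fraction(0))

def is_absolutely_continuous(phi_w, mu_w):
    # With nonnegative point weights, mu(E) = 0 exactly when every point of E
    # has weight zero, so it suffices to test single points.
    return all(phi_w[x] == 0 for x in mu_w if mu_w[x] == 0)

# Density: ratio of weights where mu > 0; the value on mu-null points is arbitrary.
density = {x: (phi_w[x] / mu_w[x] if mu_w[x] > 0 else Fraction(0)) for x in X}

def integral(f, E):
    """Integral of f over E with respect to mu (a finite sum here)."""
    return sum((f[x] * mu_w[x] for x in E), Fraction(0))

print(is_absolutely_continuous(phi_w, mu_w), density)
```

Changing the density on the point "c" alters nothing in (8), in line with the uniqueness statement of Theorem 2.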





12 For the details, see G. E. Shilov and B. L. Gurevich, Integral, Measure and Deriv- 
ative: A Unified Approach (translated by R. A. Silverman), Prentice-Hall, Inc., Englewood 
Cliffs, N.J. (1966), Chap. 10. 

Remark 2. It can also be shown¹³ that an arbitrary charge $\Phi$ has a unique
representation as the sum

$$\Phi(E) = A(E) + S(E) + D(E)$$

of an absolutely continuous charge A, a singular charge S and a discrete
charge D. This is the exact analogue of the Lebesgue decomposition on
p. 341.


Problem 1. Given any charge $\Phi$ defined on a $\sigma$-algebra $\mathcal{S}$, prove that
there is a constant M > 0 such that $|\Phi(E)| \le M$ for all $E \in \mathcal{S}$.


Problem 2. Give an example of two distinct Hahn decompositions of a 
space X. 


Problem 3. Prove that a charge $\Phi$ vanishes identically if it is both
absolutely continuous and singular with respect to a measure $\mu$.


Problem 4. Prove that if a charge $\Phi$ is concentrated on a set $A_0$, then so
are its positive, negative and total variations.


Problem 5. Prove that 


a) Every absolutely continuous charge is continuous; 
b) Every discrete charge is singular. 


Problem 6. Prove that if a charge $\Phi$ is absolutely continuous (with
respect to a measure $\mu$), then so are its positive, negative and total variations.


Problem 7. Prove that if a charge $\Phi$ is discrete, then there are no more
than countably many points $x_1, x_2, \ldots, x_n, \ldots$ and corresponding real
numbers $h_1, h_2, \ldots, h_n, \ldots$ such that $\mu(\{x_n\}) = 0$ and

$$\Phi(E) = \sum_{x_n \in E} h_n.$$

Write expressions for the positive, negative and total variations of $\Phi$.


Problem 8. Let X be the square $0 \le x \le 1$, $0 \le y \le 1$ equipped with
ordinary two-dimensional Lebesgue measure $\mu$, and let $\Phi(E)$ be the ordinary
one-dimensional Lebesgue measure of the intersection of E with the interval
$0 \le x \le 1$. Prove that $\Phi$ is continuous and singular, but not absolutely
continuous.


¹³ G. E. Shilov and B. L. Gurevich, op. cit., Chap. 9.


MORE ON INTEGRATION


35. Product Measures. Fubini’s Theorem 


The problem of reducing double (or multiple) integrals to iterated integrals 
plays an important role in classical analysis. In the Lebesgue theory, the key 
result along these lines is Fubini’s theorem, proved in Sec. 35.3. En route 
to Fubini’s theorem we will need the preliminary topics treated in Secs. 35.1 
and 35.2, which are also of interest in their own right. 


35.1. Direct products of sets and measures. By the direct (or Cartesian)
product of two sets X and Y, denoted by $X \times Y$, we mean the set of all
ordered pairs (x, y) where $x \in X$, $y \in Y$. Similarly, by the direct product of
n sets $X_1, X_2, \ldots, X_n$, denoted by

$$X_1 \times X_2 \times \cdots \times X_n, \tag{1}$$

we mean the set of all ordered n-tuples $(x_1, x_2, \ldots, x_n)$, where $x_1 \in X_1$,
$x_2 \in X_2, \ldots, x_n \in X_n$. In particular, if

$$X_1 = X_2 = \cdots = X_n = X,$$

we write (1) simply as $X^n$, the "nth power of X."


Example 1. Real n-space $R^n$ is the nth power of the real line $R^1$, as
anticipated by the notation.


Example 2. The unit cube $I^n$ in n-space, i.e., the set of all elements of $R^n$
with coordinates satisfying the inequalities

$$0 \le x_k \le 1 \qquad (k = 1, 2, \ldots, n),$$

is the nth power of the closed unit interval $I^1 = [0, 1]$.
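
Now let $\mathcal{S}_1, \mathcal{S}_2, \ldots, \mathcal{S}_n$ be systems of subsets of the sets $X_1, X_2, \ldots,
X_n$, respectively. Then by

$$\mathfrak{S} = \mathcal{S}_1 \times \mathcal{S}_2 \times \cdots \times \mathcal{S}_n$$

we mean the system of subsets of the direct product (1) which can be
represented in the form

$$A = A_1 \times A_2 \times \cdots \times A_n,$$

where

$$A_k \in \mathcal{S}_k \qquad (k = 1, 2, \ldots, n).$$

If

$$\mathcal{S}_1 = \mathcal{S}_2 = \cdots = \mathcal{S}_n = \mathcal{S},$$

then $\mathfrak{S}$ is the "nth power of $\mathcal{S}$," written

$$\mathfrak{S} = \mathcal{S}^n.$$

For example, the system of all closed rectangular parallelepipeds in $R^n$ is the
nth power of the system of all closed intervals in $R^1$.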


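THEOREM 1. If $\mathcal{S}_1, \mathcal{S}_2, \ldots, \mathcal{S}_n$ are semirings, then so is the set
$\mathfrak{S} = \mathcal{S}_1 \times \mathcal{S}_2 \times \cdots \times \mathcal{S}_n$.


Proof. By the definition of a semiring (see p. 32), we must show that¹

a) If $A, B \in \mathfrak{S}$, then $A \cap B \in \mathfrak{S}$;
b) If $A, B \in \mathfrak{S}$ and $B \subset A$, then A can be represented as a finite
union

$$A = \bigcup_{k=1}^n C^{(k)}$$

of pairwise disjoint sets $C^{(k)} \in \mathfrak{S}$, with $B = C^{(1)}$.

It is clearly enough to prove these assertions for the case n = 2. Thus
suppose $A \in \mathcal{S}_1 \times \mathcal{S}_2$, $B \in \mathcal{S}_1 \times \mathcal{S}_2$. Then

$$A = A_1 \times A_2 \quad (A_1 \in \mathcal{S}_1,\ A_2 \in \mathcal{S}_2), \qquad
B = B_1 \times B_2 \quad (B_1 \in \mathcal{S}_1,\ B_2 \in \mathcal{S}_2), \tag{2}$$

and hence

$$A \cap B = (A_1 \times A_2) \cap (B_1 \times B_2) = (A_1 \cap B_1) \times (A_2 \cap B_2).$$

But $A_1 \cap B_1 \in \mathcal{S}_1$, $A_2 \cap B_2 \in \mathcal{S}_2$, since $\mathcal{S}_1$ and $\mathcal{S}_2$ are semirings. It
follows that $A \cap B \in \mathcal{S}_1 \times \mathcal{S}_2$. This proves a).
To prove b), suppose that

$$B_1 \subset A_1, \qquad B_2 \subset A_2,$$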




¹ Note that the empty set $\emptyset$ belongs to $\mathfrak{S}$, since $\emptyset = \emptyset \times \emptyset \times \cdots \times \emptyset$ (why?).
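in addition to (2). Then, since $\mathcal{S}_1$ and $\mathcal{S}_2$ are semirings, there are finite
expansions

$$A_1 = B_1 \cup B_1^{(1)} \cup \cdots \cup B_1^{(k)},$$
$$A_2 = B_2 \cup B_2^{(1)} \cup \cdots \cup B_2^{(l)},$$

where the sets $B_1, B_1^{(1)}, \ldots, B_1^{(k)}$ are pairwise disjoint and belong to $\mathcal{S}_1$,
while the sets $B_2, B_2^{(1)}, \ldots, B_2^{(l)}$ are pairwise disjoint and belong to $\mathcal{S}_2$.
Therefore

$$\begin{aligned}
A = A_1 \times A_2 = {} & (B_1 \times B_2) \cup (B_1 \times B_2^{(1)}) \cup \cdots \cup (B_1 \times B_2^{(l)}) \\
& \cup (B_1^{(1)} \times B_2) \cup (B_1^{(1)} \times B_2^{(1)}) \cup \cdots \cup (B_1^{(1)} \times B_2^{(l)}) \\
& \cup \cdots \cup (B_1^{(k)} \times B_2) \cup (B_1^{(k)} \times B_2^{(1)}) \cup \cdots \cup (B_1^{(k)} \times B_2^{(l)})
\end{aligned}$$

is the desired finite expansion of $A_1 \times A_2$, where $B_1 \times B_2$ is the first term
and the other terms are pairwise disjoint and belong to $\mathfrak{S} =
\mathcal{S}_1 \times \mathcal{S}_2$. ∎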
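
Part b) of the proof can be traced on a concrete semiring. The following sketch assumes, for illustration only, that both factors are the semiring of half-open intervals [a, b) on the line, represented as pairs of numbers; the function names are hypothetical. Each factor is split into the inner interval plus at most two leftover pieces, and the rectangles of the expansion are all products of these pieces.

```python
from itertools import product

def split_interval(outer, inner):
    """Write the half-open interval `outer` as `inner` plus disjoint half-open pieces."""
    (a, b), (c, d) = outer, inner
    assert a <= c < d <= b, "inner must be a nonempty subinterval of outer"
    pieces = [inner]                 # the inner interval always comes first
    if a < c:
        pieces.append((a, c))
    if d < b:
        pieces.append((d, b))
    return pieces

def split_rectangle(A, B):
    """Expand A = A1 x A2 into disjoint rectangles whose first term is B = B1 x B2."""
    (A1, A2), (B1, B2) = A, B
    return list(product(split_interval(A1, B1), split_interval(A2, B2)))

def area(rect):
    (a, b), (c, d) = rect
    return (b - a) * (d - c)

def overlap(R, S):
    """True if two half-open rectangles have a common point."""
    return all(max(r[0], s[0]) < min(r[1], s[1]) for r, s in zip(R, S))

A = ((0, 3), (0, 3))
B = ((1, 2), (1, 2))
rects = split_rectangle(A, B)
print(len(rects), rects[0], sum(area(R) for R in rects))
```

The ordering produced by `product` mirrors the expansion in the proof: the term $B_1 \times B_2$ comes first, followed by the remaining products of pieces.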


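Now let $\mathcal{S}_1, \mathcal{S}_2, \ldots, \mathcal{S}_n$ be n semirings, equipped with measures

$$\mu_1(A_1),\ \mu_2(A_2),\ \ldots,\ \mu_n(A_n) \qquad (A_k \in \mathcal{S}_k), \tag{3}$$

and let $\mu$ be the measure on the semiring $\mathfrak{S} = \mathcal{S}_1 \times \mathcal{S}_2 \times \cdots \times \mathcal{S}_n$
defined by the formula

$$\mu(A) = \mu_1(A_1)\,\mu_2(A_2) \cdots \mu_n(A_n)$$

for every $A = A_1 \times A_2 \times \cdots \times A_n$. Then $\mu$ is called the direct (or Cartesian)
product² of the measures (3), and is denoted by

$$\mu = \mu_1 \times \mu_2 \times \cdots \times \mu_n.$$

To confirm that $\mu$ is indeed a measure, we now show that $\mu$ is additive ($\mu$ is
obviously real and nonnegative). It will again be enough to consider the
case n = 2. Suppose

$$A = A_1 \times A_2 = \bigcup_{k=1}^t B^{(k)}, \tag{4}$$

where

$$B^{(i)} \cap B^{(j)} = \emptyset \qquad (i \ne j)$$

and

$$B^{(k)} = B_1^{(k)} \times B_2^{(k)}.$$

According to Lemma 2, p. 33, there are finite expansions

$$A_1 = \bigcup_{m=1}^r C_1^{(m)}, \qquad A_2 = \bigcup_{n=1}^s C_2^{(n)},$$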





each involving pairwise disjoint sets, such that each $B_1^{(k)}$ is a finite union
    $B_1^{(k)} = \bigcup_{m \in M_k} C_1^{(m)}$
of certain of the sets $C_1^{(m)}$, while each $B_2^{(k)}$ is a finite union
    $B_2^{(k)} = \bigcup_{n \in N_k} C_2^{(n)}$
of certain of the sets $C_2^{(n)}$ (here $M_k$ denotes some subset of the set $\{1, 2, \ldots, r\}$
and $N_k$ some subset of the set $\{1, 2, \ldots, s\}$). But then, by the additivity of
$\mu_1$ and $\mu_2$, we have
    $\mu(A) = \mu_1(A_1) \mu_2(A_2) = \sum_{m=1}^{r} \mu_1(C_1^{(m)}) \sum_{n=1}^{s} \mu_2(C_2^{(n)})$
          $= \sum_{k=1}^{t} \sum_{m \in M_k} \mu_1(C_1^{(m)}) \sum_{n \in N_k} \mu_2(C_2^{(n)})$
          $= \sum_{k=1}^{t} \mu_1(B_1^{(k)}) \mu_2(B_2^{(k)}) = \sum_{k=1}^{t} \mu(B^{(k)})$,
which, when compared with (4), shows the additivity of $\mu = \mu_1 \times \mu_2$.

² The term product measure will be used with a different meaning below.


Example 3. Thus the additivity of area of rectangles in the plane follows 
from the additivity of length of intervals on the line. 


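THEOREM 2. If the measures $\mu_1, \mu_2, \ldots, \mu_n$ are σ-additive, then so is
the measure $\mu = \mu_1 \times \mu_2 \times \cdots \times \mu_n$.


Proof. Again we need only consider the case n = 2. Let $\lambda_1$ denote the
Lebesgue extension of the measure $\mu_1$, and suppose
    $C = \bigcup_{n=1}^{\infty} C_n$,
where the sets $C_n$ are pairwise disjoint and the sets $C$, $C_n$ belong to
$\mathfrak{S}_1 \times \mathfrak{S}_2$, i.e.,
    $C = A \times B$  $(A \in \mathfrak{S}_1,\ B \in \mathfrak{S}_2)$,
    $C_n = A_n \times B_n$  $(A_n \in \mathfrak{S}_1,\ B_n \in \mathfrak{S}_2)$.
Moreover, let
    $f_n(x) = \begin{cases} \mu_2(B_n) & \text{if } x \in A_n, \\ 0 & \text{if } x \notin A_n. \end{cases}$
We then have
    $\sum_{n=1}^{\infty} f_n(x) = \mu_2(B)$ if $x \in A$,
and hence, by the corollary on p. 307,
    $\sum_{n=1}^{\infty} \int_A f_n(x) \, d\lambda_1 = \int_A \mu_2(B) \, d\lambda_1 = \lambda_1(A) \mu_2(B)$
        $= \mu_1(A) \mu_2(B) = \mu(C)$.    (5)
But
    $\int_A f_n(x) \, d\lambda_1 = \mu_1(A_n) \mu_2(B_n) = \mu(C_n)$.    (6)
Substituting (6) into (5), we get
    $\mu(C) = \sum_{n=1}^{\infty} \mu(C_n)$. ∎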


Again let $\mathfrak{S}_1, \mathfrak{S}_2, \ldots, \mathfrak{S}_n$ be n semirings, this time equipped with
σ-additive measures (3). Then it follows from Theorem 2 that the measure³
    $m = \mu_1 \times \mu_2 \times \cdots \times \mu_n$    (7)
is σ-additive on the semiring
    $\mathfrak{S} = \mathfrak{S}_1 \times \mathfrak{S}_2 \times \cdots \times \mathfrak{S}_n$.
Therefore, as in Sec. 27, $m$ has a Lebesgue extension $\mu$ defined on a σ-ring
$\mathfrak{S}_\mu \supset \mathfrak{S}$. This measure $\mu$ is called the product measure of the measures (3),
and is denoted by
    $\mu = \mu_1 \otimes \mu_2 \otimes \cdots \otimes \mu_n$.    (8)
The distinction between the meaning of the symbols $\times$ and $\otimes$ in (7) and
(8) is crucial.


Example 4. Let
    $\mu_1 = \mu_2 = \cdots = \mu_n = \mu^1$,
where $\mu^1$ is ordinary Lebesgue measure on the line. Then the product
measure (8) is ordinary Lebesgue measure in n-space.


35.2. Evaluation of a product measure. Let G be a region in the xy-plane 
bounded by the vertical lines x = a, x = b (a < b) and the curves y = f(x), 
y = g(x), where f(x) < g(x). Then it will be recalled from calculus that the 
area of G is given by the integral 


    $\int_a^b [g(x) - f(x)] \, dx$,


where the difference $g(x_0) - f(x_0)$ is just the length of the segment in which
the vertical line $x = x_0$ intersects the region G. As we now show, the natural
generalization of this method can be used to evaluate an arbitrary product 
measure: 


³ We change to the symbol $m$ here, to "free" $\mu$ for use in formula (8).


THEOREM 3. Let $\mu$ be the product measure
    $\mu = \mu_x \otimes \mu_y$
of two measures $\mu_x$ and $\mu_y$ such that

1) $\mu_x$ is σ-additive on a Borel algebra $\mathfrak{S}_{\mu_x}$ of subsets of a set $X$;

2) $\mu_y$ is σ-additive on a Borel algebra $\mathfrak{S}_{\mu_y}$ of subsets of a set $Y$;

3) $\mu_x$ and $\mu_y$ are complete,⁴ in the sense that $B \subset A$ and $\mu_x(A) = 0$
   implies that $B$ is measurable (with measure zero), and similarly for $\mu_y$.

Then
    $\mu(A) = \int_X \mu_y(A_x) \, d\mu_x = \int_Y \mu_x(A_y) \, d\mu_y$    (9)
for every $\mu$-measurable set $A$, where⁵
    $A_x = \{y : (x, y) \in A\}$  ($x$ fixed),
    $A_y = \{x : (x, y) \in A\}$  ($y$ fixed).
Proof. We note in passing that the integral over $X$ in (9) reduces to
an integral over the set
    $\bigcup_y A_y \subset X$
outside which $\mu_y(A_x)$ vanishes (and similarly for the integral over $Y$).
It will be enough to prove that
    $\mu(A) = \int_X \varphi_A(x) \, d\mu_x$,    (10)
where
    $\varphi_A(x) = \mu_y(A_x)$,
since the other part of (9) is proved in exactly the same way. Observe
that implicit in the theorem is the conclusion that the set $A_x$ is
$\mu_y$-measurable for almost all $x$ (in the sense of the measure $\mu_x$) and that the
function $\varphi_A(x)$ is $\mu_x$-measurable, since otherwise (10) would be meaningless.
The measure $\mu$ is the Lebesgue extension of the measure
    $m = \mu_x \times \mu_y$
defined on the semiring $\mathfrak{S}_m$ of all sets of the form
    $A = A_{y_0} \times A_{x_0}$  $(A \in \mathfrak{S}_\mu)$,
where $\mathfrak{S}_\mu$ is the Borel algebra of $\mu$-measurable subsets of $X \times Y$. But
(10) obviously holds for all such sets, since for them
    $\varphi_A(x) = \begin{cases} \mu_y(A_{x_0}) & \text{if } x \in A_{y_0}, \\ 0 & \text{if } x \notin A_{y_0}. \end{cases}$





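⁴ The Lebesgue extension of any measure is complete (see Problem 7, p. 280).

⁵ If $X$ is the x-axis and $Y$ the y-axis (so that $X \times Y$ is the xy-plane), then $A_{x_0}$ is the
projection onto the y-axis of the set in which the vertical line $x = x_0$ intersects the set $A$
(and similarly for $A_{y_0}$).

Moreover, (10) carries over at once to the ring $\mathfrak{R}(\mathfrak{S}_m)$ generated by
$\mathfrak{S}_m$, since $\mathfrak{R}(\mathfrak{S}_m)$ is just the system of all sets which can be represented
as finite unions of pairwise disjoint sets of $\mathfrak{S}_m$ (recall Theorem 3, p. 34).

To prove (10) for an arbitrary set $A \in \mathfrak{S}_\mu$, we recall from Theorem 8,
p. 277 that there are sets
    $B_{nk} \in \mathfrak{R}(\mathfrak{S}_m)$  $(B_{n1} \subset B_{n2} \subset \cdots \subset B_{nk} \subset \cdots)$
and corresponding sets
    $B_n = \bigcup_k B_{nk} \in \mathfrak{S}_\mu$  $(B_1 \supset B_2 \supset \cdots \supset B_n \supset \cdots)$
such that
    $A \subset B = \bigcap_n B_n$,
    $\mu(A) = \mu(B)$.    (11)
Clearly,
    $\varphi_{B_n}(x) = \lim_{k \to \infty} \varphi_{B_{nk}}(x)$,  $\varphi_{B_{n1}}(x) \le \varphi_{B_{n2}}(x) \le \cdots \le \varphi_{B_{nk}}(x) \le \cdots$,
    $\varphi_B(x) = \lim_{n \to \infty} \varphi_{B_n}(x)$,  $\varphi_{B_1}(x) \ge \varphi_{B_2}(x) \ge \cdots \ge \varphi_{B_n}(x) \ge \cdots$.
Hence we can invoke Levi's theorem⁶ to extend (10) from the ring $\mathfrak{R}(\mathfrak{S}_m)$
to the system of all sets $B \in \mathfrak{S}_\mu$ of the form
    $B = \bigcap_n \bigcup_k B_{nk}$.    (12)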


Moreover if $\mu(A) = 0$, then $\mu(B) = 0$, because of (11), and hence
    $\varphi_B(x) = \mu_y(B_x) = 0$
almost everywhere. Therefore, by the completeness of $\mu_y$, $A_x$ is measurable and
    $\varphi_A(x) = \mu_y(A_x) = 0$
for almost all $x$, since $A_x \subset B_x$. But then
    $\int_X \varphi_A(x) \, d\mu_x = 0 = \mu(A)$.
In other words, (10) holds for all sets of measure zero, as well as for all
sets of the form (12). But, according to (11), an arbitrary set $A \in \mathfrak{S}_\mu$
can be represented as
    $A = B - Z$,
where $B$ is of the form (12) and $Z$ is of measure zero. Therefore
    $B = A \cup Z$  $(A \cap Z = \emptyset)$.





⁶ See Theorem 2, p. 305 and Problem 2, p. 311.

It follows that
    $\mu(A) = \mu(B) = \int_X \varphi_B(x) \, d\mu_x$
          $= \int_X \varphi_A(x) \, d\mu_x + \int_X \varphi_Z(x) \, d\mu_x = \int_X \varphi_A(x) \, d\mu_x$,
i.e., (10) holds for every $A \in \mathfrak{S}_\mu$. ∎


Example 1. Let $M$ be any $\mu_x$-measurable set, and let $f$ be an integrable
nonnegative function. Moreover, let $Y$ be the y-axis, and let $\mu_y$ be ordinary
Lebesgue measure on the line. Consider the set
    $A = \{(x, y) : x \in M,\ 0 \le y \le f(x)\}$.    (13)
Then
    $\varphi_A(x) = \mu_y(A_x) = \begin{cases} f(x) & \text{if } x \in M, \\ 0 & \text{if } x \notin M, \end{cases}$
and hence, by Theorem 3,
    $\mu(A) = \int_X \varphi_A(x) \, d\mu_x = \int_M f(x) \, d\mu_x$.    (14)
This allows us to interpret the Lebesgue integral of a nonnegative function
over a set $M \subset X$ in terms of the $\mu$-measure of the set (13), where
$\mu = \mu_x \otimes \mu_y$.

Example 2. In the preceding example, let $X$ be the x-axis and let $M$ be a
closed interval $[a, b]$. Moreover, suppose $f$ is nonnegative and Riemann-
integrable on $[a, b]$. Then (14) reduces to the familiar formula
    $\mu(A) = \int_a^b f(x) \, dx$
for the area under the graph of the function $y = f(x)$ between $x = a$ and
$x = b$.
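
Formula (14) can be checked numerically in a simple case. The sketch below is only an illustration: the choice f(x) = x² on [0, 1], the grid sizes and the function names are assumptions, not part of the text. It compares the integral of the section lengths with a direct approximation of the plane measure of the set (13) obtained by counting small grid squares.

```python
def section_length(f, x):
    """Length of the vertical section A_x = [0, f(x)] of the set (13)."""
    return max(f(x), 0.0)

def measure_by_sections(f, a, b, n=2000):
    """Integrate the section lengths over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return sum(section_length(f, a + (i + 0.5) * h) for i in range(n)) * h

def measure_by_squares(f, a, b, height, n=400):
    """Approximate the area of A by counting grid squares whose centers lie in A."""
    hx, hy = (b - a) / n, height / n
    count = 0
    for i in range(n):
        fx = f(a + (i + 0.5) * hx)
        for j in range(n):
            y = (j + 0.5) * hy
            if 0.0 <= y <= fx:
                count += 1
    return count * hx * hy

f = lambda x: x * x
by_sections = measure_by_sections(f, 0.0, 1.0)
by_squares = measure_by_squares(f, 0.0, 1.0, height=1.0)
print(by_sections, by_squares)   # both should be close to 1/3
```

The two numbers come from independent computations, so their agreement is a modest sanity check of Theorem 3 for this particular set, not a proof.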


35.3. Fubini’s theorem. The next theorem is basic in the theory of 
multiple integration: 


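THEOREM 4 (Fubini). Let $\mu_x$ and $\mu_y$ be the same as in Theorem 3, let $\mu$
be the product measure $\mu_x \otimes \mu_y$, and let $f(x, y)$ be $\mu$-integrable on the set
$A \subset X \times Y$. Then
    $\int_A f(x, y) \, d\mu = \int_X \left( \int_{A_x} f(x, y) \, d\mu_y \right) d\mu_x = \int_Y \left( \int_{A_y} f(x, y) \, d\mu_x \right) d\mu_y$.    (15)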


Proof. Note that implicit in the theorem is the conclusion that the
"inner integrals" in parentheses exist for almost all values of the variable
over which they are integrated ($x$ in the first case, $y$ in the second). We
begin by assuming temporarily that $f(x, y) \ge 0$. Consider the triple
Cartesian product
    $U = X \times Y \times Z$,
where $Z$ is the real line, and equip $U$ with the product measure
    $\mu_u = \mu_x \otimes \mu_y \otimes \mu^1 = \mu \otimes \mu^1 = \mu_x \otimes (\mu_y \otimes \mu^1)$
(see Problem 3), where $\mu^1$ is ordinary Lebesgue measure on the line.
Moreover, consider the set $W \subset U$ defined by
    $W = \{(x, y, z) : x \in A_y,\ y \in A_x,\ 0 \le z \le f(x, y)\}$.
By (14),
    $\mu_u(W) = \int_A f(x, y) \, d\mu$.    (16)
On the other hand, by Theorem 3,
    $\mu_u(W) = \int_X \lambda(W_x) \, d\mu_x$,    (17)
where
    $\lambda = \mu_y \otimes \mu^1$,
    $W_x = \{(y, z) : (x, y, z) \in W\}$  ($x$ fixed).
Using (14) again, we obtain
    $\lambda(W_x) = \int_{A_x} f(x, y) \, d\mu_y$.    (18)
Comparing (16)-(18), we get part of (15). The rest of (15) is proved in
exactly the same way. To remove the restriction that $f(x, y)$ be nonnegative,
we merely note that
    $f(x, y) = f^+(x, y) - f^-(x, y)$,
where the functions
    $f^+(x, y) = \frac{|f(x, y)| + f(x, y)}{2}$,  $f^-(x, y) = \frac{|f(x, y)| - f(x, y)}{2}$
are both nonnegative. ∎


Remark. Thus Fubini's theorem asserts that if the "double integral"
    $I = \int_A f(x, y) \, d\mu$    (19)
exists, then so do the "iterated integrals"
    $I_{xy} = \int_X \left( \int_{A_x} f(x, y) \, d\mu_y \right) d\mu_x$,  $I_{yx} = \int_Y \left( \int_{A_y} f(x, y) \, d\mu_x \right) d\mu_y$,    (20)
and moreover $I = I_{xy} = I_{yx}$.


Problem 1. Give an example of a set in $R^2$ which is not a direct product
of any two sets in $R^1$.


Problem 2. Prove that the direct product of two rings (or σ-rings) need
not be a ring (or σ-ring).


Problem 3. Given three spaces $X$, $Y$ and $Z$, equipped with measures
$\mu_x$, $\mu_y$ and $\mu_z$, respectively, prove that $(\mu_x \otimes \mu_y) \otimes \mu_z$ and $\mu_x \otimes (\mu_y \otimes \mu_z)$
are identical measures on $X \times Y \times Z$.


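Problem 4. Let $A = [-1, 1] \times [-1, 1]$ and
    $f(x, y) = \frac{xy}{(x^2 + y^2)^2}$.
Prove that
a) The iterated integrals (20) exist and are equal;
b) The double integral (19) fails to exist.


Hint. Since
    $\int_{-1}^{1} f(x, y) \, dx = \int_{-1}^{1} f(x, y) \, dy = 0$,
we have
    $\int_{-1}^{1} \left( \int_{-1}^{1} f(x, y) \, dx \right) dy = \int_{-1}^{1} \left( \int_{-1}^{1} f(x, y) \, dy \right) dx = 0$.
On the other hand, the double integral fails to exist, since
    $\int_{-1}^{1} \int_{-1}^{1} |f(x, y)| \, dx \, dy \ge \int_0^1 dr \int_0^{2\pi} \frac{|\sin \varphi \cos \varphi|}{r} \, d\varphi = 2 \int_0^1 \frac{dr}{r} = \infty$
after transforming to polar coordinates.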


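Problem 5. Let $A = [0, 1] \times [0, 1]$ and
    $f(x, y) = \begin{cases}
        2^{2n} & \text{if } \frac{1}{2^n} \le x < \frac{1}{2^{n-1}},\ \frac{1}{2^n} \le y < \frac{1}{2^{n-1}}, \\
        -2^{2n+1} & \text{if } \frac{1}{2^{n+1}} \le x < \frac{1}{2^n},\ \frac{1}{2^n} \le y < \frac{1}{2^{n-1}}, \\
        0 & \text{otherwise}.
    \end{cases}$
Prove that the iterated integrals (20) exist but are unequal.


Ans. $\int_0^1 \left( \int_0^1 f(x, y) \, dx \right) dy = 0$,  $\int_0^1 \left( \int_0^1 f(x, y) \, dy \right) dx = 1$.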
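
The answer to Problem 5 can be confirmed with exact rational arithmetic. The sketch below assumes the function is read as follows: $f = 2^{2n}$ when both $x$ and $y$ lie in the band $[2^{-n}, 2^{-(n-1)})$, and $f = -2^{2n+1}$ when $y$ lies in that band while $x$ lies in the next band $[2^{-(n+1)}, 2^{-n})$. The function names and the cutoff `N` are hypothetical choices for the illustration.

```python
from fractions import Fraction

def band_length(n):
    """Length of the interval [1/2**n, 1/2**(n-1))."""
    return Fraction(1, 2**n)

def inner_dx(n):
    """Integral of f(x, y) over x in [0, 1] for a fixed y in band n."""
    positive = 2**(2 * n) * band_length(n)            # x in band n
    negative = -2**(2 * n + 1) * band_length(n + 1)   # x in band n + 1
    return positive + negative

def inner_dy(j):
    """Integral of f(x, y) over y in [0, 1] for a fixed x in band j."""
    total = 2**(2 * j) * band_length(j)                       # y in band j
    if j >= 2:
        total += -2**(2 * (j - 1) + 1) * band_length(j - 1)   # y in band j - 1
    return total

def iterated(inner, bands):
    """Outer integral: sum of (inner integral) * (band length) over the bands."""
    return sum(inner(n) * band_length(n) for n in range(1, bands + 1))

N = 30  # number of bands kept; an arbitrary cutoff for the illustration
print(iterated(inner_dx, N), iterated(inner_dy, N))
```

The cutoff does not affect the result, since every inner integral other than the one for $x$ in $[1/2, 1)$ vanishes.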


Problem 6. The preceding two problems show that the existence of the
iterated integrals (20) does not imply either the existence of the double
integral (19) or the validity of formula (15). However, show that the
existence of either of the integrals
    $\int_X \left( \int_{A_x} |f(x, y)| \, d\mu_y \right) d\mu_x$,  $\int_Y \left( \int_{A_y} |f(x, y)| \, d\mu_x \right) d\mu_y$    (21)
implies both the existence of (19) and the validity of (15).


Hint. Suppose the first of the integrals (21) exists and equals $M$. The
function
    $f_n(x, y) = \min \{|f(x, y)|, n\}$
is measurable and bounded, and hence summable on $A$. By Fubini's theorem,
    $\int_A f_n(x, y) \, d\mu = \int_X \left( \int_{A_x} f_n(x, y) \, d\mu_y \right) d\mu_x \le M$.
Moreover, $\{f_n(x, y)\}$ is a nondecreasing sequence of functions converging
to $|f(x, y)|$. Use Levi's theorem to deduce the summability of $|f(x, y)|$
and hence that of $f(x, y)$ on $A$.


Problem 7. Show that Fubini's theorem continues to hold for the case of
σ-finite measures (cf. Sec. 30.2).


36. The Stieltjes Integral 


36.1. Stieltjes measures. Let $F$ be a nondecreasing function defined on a
closed interval $[a, b]$, and suppose $F$ is continuous from the left at every
point of $(a, b]$. Let $\mathfrak{S}$ be the semiring of all subintervals (open, closed or
half-open) of $[a, b)$, and let $m$ be the measure on $\mathfrak{S}$ defined by the formulas⁷
    $m(\alpha, \beta) = F(\beta) - F(\alpha + 0)$,
    $m[\alpha, \beta] = F(\beta + 0) - F(\alpha)$,
    $m(\alpha, \beta] = F(\beta + 0) - F(\alpha + 0)$,    (1)
    $m[\alpha, \beta) = F(\beta) - F(\alpha)$.
Finally, let $\mu_F$ be the Lebesgue extension of $m$, defined on the σ-algebra
$\mathfrak{S}_F$ of $\mu_F$-measurable sets. In particular, $\mathfrak{S}_F$ contains all subintervals of
$[a, b)$ and hence all Borel subsets of $[a, b)$. Then $\mu_F$ is called the (Lebesgue-)
Stieltjes measure corresponding to the function $F$, and the function $F$ itself
is called the generating function of $\mu_F$.


Example 1. The Stieltjes measure corresponding to the generating function
$F(x) = x$ is just ordinary Lebesgue measure on the line.





⁷ To avoid confusion, we omit "outer parentheses," writing $m(\alpha, \beta)$ instead of $m((\alpha, \beta))$,
and similarly in the rest of the formulas (1). Moreover, in $m[\alpha, \beta]$, we allow the case
$\alpha = \beta$.

Example 2. Let $F$ be a jump function, with discontinuity points
$x_1, x_2, \ldots, x_n, \ldots$ and corresponding jumps $h_1, h_2, \ldots, h_n, \ldots$. Then
every subset $A \subset [a, b)$ is $\mu_F$-measurable, with measure
    $\mu_F(A) = \sum_{x_n \in A} h_n$.    (2)
In fact, according to (1), every single-element set $\{x_n\}$ has measure $h_n$, and
moreover it is clear that the measure of the complement of the set
$\{x_1, x_2, \ldots, x_n, \ldots\}$ is zero. But then (2) holds, by the σ-additivity of $\mu_F$. A
Stieltjes measure $\mu_F$ of this type, generated by a jump function, is said to be
discrete.
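
A small computation may make formula (2) concrete. The jump points and jump sizes below are hypothetical, and the names `jumps`, `F` and `mu` are chosen only for this sketch; the measure of a half-open interval is computed once from the interval formula $m[\alpha, \beta) = F(\beta) - F(\alpha)$ and once as a sum of jumps.

```python
from fractions import Fraction

# Hypothetical jump function on [0, 1): jump h_n at the point x_n.
jumps = {Fraction(1, 4): Fraction(1, 2),
         Fraction(1, 2): Fraction(1, 3),
         Fraction(3, 4): Fraction(1, 6)}

def F(x):
    """Left-continuous jump function: sum of h_n over all x_n < x."""
    return sum(h for xn, h in jumps.items() if xn < x)

def mu(contains):
    """Discrete Stieltjes measure of the set described by the predicate `contains`."""
    return sum(h for xn, h in jumps.items() if contains(xn))

alpha, beta = Fraction(1, 4), Fraction(3, 4)
via_interval_formula = F(beta) - F(alpha)              # m[alpha, beta)
via_sum_of_jumps = mu(lambda x: alpha <= x < beta)     # jumps located in [alpha, beta)
print(via_interval_formula, via_sum_of_jumps)
```

Both expressions pick up the jumps at 1/4 and 1/2 and ignore the jump at 3/4, which lies outside the half-open interval.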


Example 3. Let $F$ be an absolutely continuous nondecreasing function on
$[a, b)$, with derivative $f = F'$. Then the Stieltjes measure $\mu_F$ is defined on
all Lebesgue-measurable subsets $A \subset [a, b)$ and
    $\mu_F(A) = \int_A f(x) \, dx$.    (3)
In fact, by Theorem 6, p. 340,
    $\mu_F(\alpha, \beta) = F(\beta) - F(\alpha) = \int_\alpha^\beta f(x) \, dx$    (4)
for every open interval $(\alpha, \beta)$. But then (3) holds for every Lebesgue-
measurable set $A \subset [a, b)$ since a Lebesgue extension of a σ-additive measure
is uniquely determined by its values on the original semiring.⁸ A Stieltjes
measure $\mu_F$ of this type, with an absolutely continuous generating function,
is itself said to be absolutely continuous.


Example 4. Let $F$ be singular (and continuous) as on p. 341. Then the
corresponding Stieltjes measure $\mu_F$ is concentrated on the set of Lebesgue
measure zero where the derivative $F'$ is nonzero or fails to exist. A Stieltjes
measure of this type is said to be singular.


Example 5. By the Lebesgue decomposition (p. 341), an arbitrary 
generating function F can be represented as a sum 


F(x) = D(x) + A(x) + S(x) (5) 


of a jump function D, an absolutely continuous function A and a singular 
function S (verify that D, A and S are themselves generating functions). 
Moreover, each of the “components” D, A and S is uniquely determined to 
within an additive constant (see Problem 4, p. 342). But clearly 


    $\mu_F = \mu_D + \mu_A + \mu_S$.





It follows that an arbitrary Lebesgue-Stieltjes measure can be represented
as a sum of a discrete measure $\mu_D$, an absolutely continuous measure $\mu_A$ and
a singular measure $\mu_S$. Moreover, this representation is unique (why?).

⁸ Give a more detailed argument, recalling Problem 1, p. 279. Note that in this case
    $m(\alpha, \beta) = m[\alpha, \beta] = m(\alpha, \beta] = m[\alpha, \beta)$.


Remark. We can easily extend the notion of a Stieltjes measure on a
(finite) interval $[a, b)$ to that of a Stieltjes measure on the whole line $(-\infty, \infty)$.
Let $F$ be a bounded nondecreasing function on $(-\infty, \infty)$, so that
    $m \le F(x) \le M$  $(-\infty < x < \infty)$.
Using the formulas (1) to define the measure of arbitrary intervals (open,
closed or half-open), not just subintervals of a fixed interval $[a, b)$, we get a
finite measure $\mu_F$ on the whole line, called a (Lebesgue-) Stieltjes measure,
as before. In particular, we have
    $\mu_F(-\infty, \infty) = F(\infty) - F(-\infty)$
for the measure of the whole line, where
    $F(\infty) = \lim_{x \to \infty} F(x)$,  $F(-\infty) = \lim_{x \to -\infty} F(x)$


(the existence of the limits follows from the fact that F is bounded and 
monotonic). 


36.2. The Lebesgue-Stieltjes integral. Let $\mu_F$ be a Stieltjes measure on
the interval $[a, b)$, corresponding to the generating function $F$, and let $f$ be
a $\mu_F$-summable function. Then by the Lebesgue-Stieltjes integral of $f$ (with
respect to $F$), denoted by
    $\int_a^b f(x) \, dF(x)$,    (6)
we simply mean the Lebesgue integral
    $\int_{[a, b)} f(x) \, d\mu_F$.


Example 1. Let $F$ be the jump function
    $F(x) = \sum_{x_n < x} h_n$,
so that $\mu_F$ is a discrete measure. Then (6) reduces to the sum
    $\sum_n f(x_n) h_n$.
Example 2. If $F$ is absolutely continuous, then
    $\int_a^b f(x) \, dF(x) = \int_a^b f(x) F'(x) \, dx$,    (7)
where the right-hand side is the integral of $fF'$ with respect to ordinary
Lebesgue measure on the line. In the case where $f(x) = \text{const}$, this is an
immediate consequence of (4). Moreover, by the σ-additivity of integrals,
(7) can be extended to the case of any simple function $f$ which is $\mu_F$-summable.
More generally, let $\{f_n\}$ be a sequence of such simple functions
converging uniformly to $f$, so that $\{f_n F'\}$ converges uniformly to $fF'$. It can
be assumed without loss of generality that
    $f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots$,
and hence that
    $f_1(x) F'(x) \le f_2(x) F'(x) \le \cdots \le f_n(x) F'(x) \le \cdots$.
Therefore, applying Levi's theorem (Theorem 2, p. 305) to both sequences
$\{f_n\}$ and $\{f_n F'\}$, we get
    $\int_a^b f(x) \, dF(x) = \lim_{n \to \infty} \int_a^b f_n(x) \, dF(x) = \lim_{n \to \infty} \int_a^b f_n(x) F'(x) \, dx = \int_a^b f(x) F'(x) \, dx$.


Example 3. Suppose
    $F(x) = D(x) + A(x)$,
where $D$ is the jump function
    $D(x) = \sum_{x_n < x} h_n$,
and $A$ is absolutely continuous. Then it follows from Examples 1 and 2 that
    $\int_a^b f(x) \, dF(x) = \sum_n f(x_n) h_n + \int_a^b f(x) A'(x) \, dx$.


In the case where F also contains a singular component, as in (5), there is no 
such representation of the Lebesgue-Stieltjes integral (6) as the sum of a series 
and an ordinary Lebesgue integral. 
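
The decomposition in Example 3 can be tried out numerically. Everything specific in the sketch below is an assumption made for illustration: the interval [0, 1), the jumps, the choice $A(x) = x$, the integrand $f(x) = x^2$ and the function names. The left-hand side is approximated by integrating a step approximation of $f$ with respect to $\mu_F$, using $\mu_F[\alpha, \beta) = F(\beta) - F(\alpha)$ for each cell.

```python
# Hypothetical data: jumps h_n at points x_n in [0, 1), and A(x) = x.
jumps = {0.25: 0.5, 0.5: 0.25}

def D(x):
    """Left-continuous jump function: sum of h_n over all x_n < x."""
    return sum(h for xn, h in jumps.items() if xn < x)

def F(x):
    """Generating function F = D + A with A(x) = x, so that A'(x) = 1."""
    return D(x) + x

def f(x):
    return x * x

def step_integral(n=20000):
    """Integral over [0, 1) of a step approximation of f with respect to mu_F.

    The cell [x_{k-1}, x_k) has measure F(x_k) - F(x_{k-1}), and f is
    replaced by its value at the left end point of the cell.
    """
    total = 0.0
    for k in range(1, n + 1):
        left, right = (k - 1) / n, k / n
        total += f(left) * (F(right) - F(left))
    return total

def series_plus_integral(n=20000):
    """Sum of f(x_n) h_n plus the integral of f(x) A'(x) dx (midpoint rule)."""
    series = sum(f(xn) * h for xn, h in jumps.items())
    h = 1.0 / n
    lebesgue = sum(f((k + 0.5) * h) for k in range(n)) * h
    return series + lebesgue

print(step_integral(), series_plus_integral())
```

The two values agree up to the discretization error of the step approximation, which shrinks as `n` grows.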


Remark. We can easily extend the notion of a Lebesgue-Stieltjes integral
with respect to a nondecreasing function $F$ to that of a Lebesgue-Stieltjes
integral with respect to an arbitrary function of bounded variation $\Phi$. In
fact, as in Theorem 4, p. 331, let
    $\Phi = v - g$,
where $v$, the total variation of $\Phi$ on the interval $[a, x]$, and $g = v - \Phi$ are
both nondecreasing. We then set
    $\int_a^b f(x) \, d\Phi(x) = \int_a^b f(x) \, dv(x) - \int_a^b f(x) \, dg(x)$    (8)
by definition (see Problem 2).


36.3. Applications to probability theory. The Lebesgue-Stieltjes integral
is widely used in mathematical analysis and its applications. The concept
plays a particularly important role in probability theory. Given a random
variable $\xi$,⁹ let
    $F(x) = P\{\xi < x\}$,
i.e., let $F(x)$ be the probability that $\xi$ takes a value less than $x$. Then $F$ is
clearly nondecreasing and continuous from the left. Moreover, $F$ satisfies
the conditions
    $F(-\infty) = 0$,  $F(\infty) = 1$
(why?). Conversely, every such function $F$ can be represented as the
probability distribution of some random variable $\xi$.

Two basic numerical characteristics of a random variable $\xi$ are its
mathematical expectation or mean (value)
    $E\xi = \int_{-\infty}^{\infty} x \, dF(x)$,    (9)
and variance
    $D\xi = \int_{-\infty}^{\infty} (x - E\xi)^2 \, dF(x)$    (10)
(however, see Problem 5).


Example 1. A random variable $\xi$ is said to be discrete if it can take no
more than countably many values $x_1, x_2, \ldots, x_n, \ldots$. For example, the
number of calls received on a given telephone line during a given time
interval is a discrete random variable. Let
    $p_n = P\{\xi = x_n\}$  $(n = 1, 2, \ldots)$
be the probability of the random variable $\xi$ taking the value $x_n$. Then the
distribution function of $\xi$ is just the jump function
    $F(x) = \sum_{x_n < x} p_n$.
In this case, the integrals (9) and (10) for the mean and variance of $\xi$ reduce
to the sums
    $E\xi = \sum_n x_n p_n$,
    $D\xi = \sum_n (x_n - a)^2 p_n$  $(a = E\xi)$.
Example 2. A random variable $\xi$ is said to be continuous if its distribution
function $F$ is absolutely continuous. The derivative
    $p(x) = F'(x)$
of the distribution function is then called the probability density of $\xi$. It
follows from Example 2, p. 364 that in this case the integrals (9) and (10)
for the mean and variance of $\xi$ reduce to the following integrals with respect
to ordinary Lebesgue measure on the line:
    $E\xi = \int_{-\infty}^{\infty} x p(x) \, dx$,
    $D\xi = \int_{-\infty}^{\infty} (x - a)^2 p(x) \, dx$  $(a = E\xi)$.

⁹ We presuppose familiarity with the rudiments of probability theory. See e.g., Y. A.
Rozanov, Introductory Probability Theory (translated by R. A. Silverman), Prentice-Hall,
Inc., Englewood Cliffs, N.J. (1969).
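
For a discrete random variable the integrals (9) and (10) become finite sums, which are easy to evaluate exactly. The distribution below (the number of heads in two tosses of a fair coin) is a hypothetical example chosen for this sketch, as are the function names.

```python
from fractions import Fraction

# Hypothetical discrete random variable: number of heads in two fair coin tosses.
distribution = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def F(x):
    """Distribution function F(x) = P{xi < x}, continuous from the left."""
    return sum(p for value, p in distribution.items() if value < x)

def mean(dist):
    """Mean as the sum of x_n * p_n, the form taken by integral (9)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance as the sum of (x_n - a)^2 * p_n with a the mean, as in integral (10)."""
    a = mean(dist)
    return sum((x - a) ** 2 * p for x, p in dist.items())

print(F(1), F(3), mean(distribution), variance(distribution))
```

Note that `F(1)` counts only the value 0, because the distribution function is defined with the strict inequality.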


36.4. The Riemann-Stieltjes integral. Besides the Lebesgue-Stieltjes integral
introduced in Sec. 36.2 (which is in effect nothing but the difference
between two ordinary Lebesgue integrals with respect to two measures on the
real line),¹⁰ we can also introduce the Riemann-Stieltjes integral, defined
as a limit of certain approximating sums, analogous to those used to define
the ordinary Riemann integral. To this end, let $f$ and $\Phi$ be two functions on
$[a, b]$, where $\Phi$ is of bounded variation and continuous from the left, and let
    $a = x_0 < x_1 < x_2 < \cdots < x_n = b$
be a partition of the interval $[a, b]$ by points of subdivision $x_0, x_1, x_2, \ldots,
x_n$. Choosing an arbitrary point $\xi_k$ in each subinterval $[x_{k-1}, x_k]$, we form
the sum
    $\sum_{k=1}^{n} f(\xi_k) [\Phi(x_k) - \Phi(x_{k-1})]$.    (11)
Suppose that as the partition is "refined," i.e., as the quantity
    $\max \{x_1 - x_0, x_2 - x_1, \ldots, x_n - x_{n-1}\}$    (12)
(equal to the maximum length of the subintervals) approaches zero, the sum
(11) approaches a limit independent of the choice of both the points of
subdivision $x_k$ and the "intermediate points" $\xi_k$. Then this limit is called
the Riemann-Stieltjes integral of $f$ with respect to $\Phi$, and is denoted by
    $\int_a^b f(x) \, d\Phi(x)$
(just as in the case of the Lebesgue-Stieltjes integral).
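
The approximating sums (11) are straightforward to compute. The following sketch is an illustration under assumed data: $f(x) = x$ and $\Phi(x) = x^2$ on [0, 1], uniform partitions, randomly chosen intermediate points, and hypothetical function names. Since this $\Phi$ is smooth, the limit should be $\int_0^1 x \cdot 2x \, dx = 2/3$.

```python
import random

def riemann_stieltjes_sum(f, phi, points, tags):
    """Sum (11): f(xi_k) * (phi(x_k) - phi(x_{k-1})) added over the partition."""
    return sum(f(t) * (phi(points[k + 1]) - phi(points[k]))
               for k, t in enumerate(tags))

def random_tagged_partition(a, b, n, rng):
    """Uniform partition of [a, b] with an arbitrary point chosen in each subinterval."""
    points = [a + (b - a) * k / n for k in range(n + 1)]
    tags = [rng.uniform(points[k], points[k + 1]) for k in range(n)]
    return points, tags

rng = random.Random(0)                 # fixed seed so that the run is repeatable
f = lambda x: x
phi = lambda x: x * x
for n in (10, 100, 1000, 10000):
    points, tags = random_tagged_partition(0.0, 1.0, n, rng)
    print(n, riemann_stieltjes_sum(f, phi, points, tags))
```

The printed sums approach 2/3 as the partition is refined, whatever intermediate points happen to be drawn.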
Remark. If $\Phi = \Phi_1 + \Phi_2$, then
    $\int_a^b f(x) \, d\Phi(x) = \int_a^b f(x) \, d\Phi_1(x) + \int_a^b f(x) \, d\Phi_2(x)$    (13)
(provided the integrals on the right exist). In fact, we need only write the
identity
    $\sum_{k=1}^{n} f(\xi_k) [\Phi(x_k) - \Phi(x_{k-1})]$
        $= \sum_{k=1}^{n} f(\xi_k) [\Phi_1(x_k) - \Phi_1(x_{k-1})] + \sum_{k=1}^{n} f(\xi_k) [\Phi_2(x_k) - \Phi_2(x_{k-1})]$
and then pass to the limit as the quantity (12) approaches zero.

¹⁰ Recall formula (8).


THEOREM 1. If f is continuous on [a,b], then its Riemann-Stieltjes 
integral exists and coincides with its Lebesgue-Stieltjes integral. 


Proof. The sum (11) can be regarded as the Lebesgue-Stieltjes integral
of the step function
    $f_n(x) = f(\xi_k)$ if $x_{k-1} \le x < x_k$  $(k = 1, \ldots, n)$.
As the partition of $[a, b]$ is refined, the sequence $\{f_n\}$ converges uniformly
to $f$ (why?). Hence, by the very definition of the Lebesgue integral
(recall p. 294),
    $\lim_{n \to \infty} \int_a^b f_n(x) \, d\Phi(x) = I$,
where $I$ is the Lebesgue-Stieltjes integral of $f$ over $[a, b)$. But then
    $\lim_{n \to \infty} \sum_{k=1}^{n} f(\xi_k) [\Phi(x_k) - \Phi(x_{k-1})] = I$,
where the limit on the left is the Riemann-Stieltjes integral of $f$ over
$[a, b]$. ∎
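THEOREM 2. If $f$ is continuous on $[a, b]$, then
    $\left| \int_a^b f(x) \, d\Phi(x) \right| \le V_a^b(\Phi) \max_{a \le x \le b} |f(x)|$,    (14)
where $V_a^b(\Phi)$ is the total variation of $\Phi$ on $[a, b]$.


Proof. The inequality
    $\left| \sum_{k=1}^{n} f(\xi_k) [\Phi(x_k) - \Phi(x_{k-1})] \right| \le \sum_{k=1}^{n} |f(\xi_k)| \, |\Phi(x_k) - \Phi(x_{k-1})|$
        $\le \max_{a \le x \le b} |f(x)| \sum_{k=1}^{n} |\Phi(x_k) - \Phi(x_{k-1})| \le V_a^b(\Phi) \max_{a \le x \le b} |f(x)|$
holds for any partition of the interval $[a, b]$. Taking the limit of the
left-hand side as $\max \{x_1 - x_0, \ldots, x_n - x_{n-1}\} \to 0$, we get (14). ∎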


Remark. If $\Phi(x) = x$, (14) reduces to the familiar estimate
    $\left| \int_a^b f(x) \, dx \right| \le (b - a) \max_{a \le x \le b} |f(x)|$
for the ordinary Riemann integral.

THEOREM 3. Let $\Phi$ be a function of bounded variation on $[a, b]$, different
from zero at no more than countably many points $c_1, c_2, \ldots, c_n, \ldots$ in
$(a, b)$. Then
    $\int_a^b f(x) \, d\Phi(x) = 0$    (15)
for any function $f$ continuous on $[a, b]$.


Proof. The assertion is obvious if $\Phi$ is nonzero at only a single point
$c_1 \in (a, b)$, since then
    $\sum_{k=1}^{n} f(\xi_k) [\Phi(x_k) - \Phi(x_{k-1})] = 0$
for an "arbitrarily fine" partition
    $a = x_0 < x_1 < \cdots < x_n = b$,
i.e., a partition for which the quantity (12) is arbitrarily small, provided
we make sure that $c_1$ is not one of the points of subdivision $x_0, x_1, \ldots,
x_n$.¹¹ Hence, by (13), the assertion is also true if $\Phi$ is nonzero at only
finitely many points in $(a, b)$. Now suppose $\Phi$ is different from zero at
countably many points
    $c_1, c_2, \ldots, c_n, \ldots$
in $(a, b)$, and let
    $y_n = \Phi(c_n)$.
Then
    $\sum_{n=1}^{\infty} |y_n| < \infty$,
since $\Phi$ is of bounded variation. Given any $\varepsilon > 0$, we choose $N$ such that
    $\sum_{n=N+1}^{\infty} |y_n| < \varepsilon$,
and write $\Phi$ in the form
    $\Phi = \Phi_N + \Phi^*$,    (16)
where $\Phi_N$ takes the values $y_1, \ldots, y_N$ at the points $c_1, \ldots, c_N$ and is
zero elsewhere, while $\Phi^*$ takes the values $y_{N+1}, y_{N+2}, \ldots$ at the points
$c_{N+1}, c_{N+2}, \ldots$ and is zero elsewhere. Then, as just shown,
    $\int_a^b f(x) \, d\Phi_N(x) = 0$.    (17)
Moreover
    $\left| \sum_{k=1}^{m} f(\xi_k) [\Phi^*(x_k) - \Phi^*(x_{k-1})] \right| \le 2M \sum_{n=N+1}^{\infty} |y_n| < 2M\varepsilon$,
where
    $M = \max_{a \le x \le b} |f(x)|$,
or
    $\left| \int_a^b f(x) \, d\Phi^*(x) \right| \le 2M\varepsilon$
after taking the limit as $m \to \infty$. This in turn implies
    $\int_a^b f(x) \, d\Phi^*(x) = 0$,    (18)
since $\varepsilon > 0$ is arbitrary. Formula (15) now follows at once from (13)
and (16)-(18). ∎

¹¹ Note that here we rely on the fact that $c_1$ is not an end point of $[a, b]$.


36.5. Helly's theorems. In Sec. 30.1 we found conditions insuring the
validity of passing to the limit in Lebesgue integrals, i.e., conditions under
which
    $\lim_{n \to \infty} \int_A f_n(x) \, d\mu = \int_A f(x) \, d\mu$,    (19)
where $\{f_n\}$ is a sequence of functions converging (almost everywhere) to a
function $f$ and the integrals are all with respect to a fixed measure $\mu$. In
the case of Stieltjes integrals, we now ask a closely related but somewhat
different question: Under what conditions does the formula
    $\lim_{n \to \infty} \int_a^b f(x) \, d\Phi_n(x) = \int_a^b f(x) \, d\Phi(x)$    (20)
hold, where $f$ is continuous and $\{\Phi_n\}$ is a sequence of functions of bounded
variation converging (everywhere) to a function $\Phi$? (Note that here, unlike
(19), the function $f$ is fixed, and it is the function $\Phi_n$, or the corresponding
Stieltjes measure, which varies.) The answer to this question is given by


THEOREM 4 (Helly's convergence theorem). Let $\{\Phi_n\}$ be a sequence of
functions of bounded variation on $[a, b]$, converging to a function $\Phi$ at every
point of $[a, b]$. Suppose the sequence of total variations $\{V_a^b(\Phi_n)\}$ is
bounded, so that
    $V_a^b(\Phi_n) \le C$  $(n = 1, 2, \ldots)$    (21)
for some constant $C > 0$. Then $\Phi$ is also of bounded variation on $[a, b]$,
and (20) holds for every function $f$ continuous on $[a, b]$.


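Proof. Let
    $a = x_0 < x_1 < \cdots < x_m = b$
be any partition of the interval $[a, b]$ by points of subdivision $x_0, x_1, \ldots,
x_m$. Then
    $\sum_{k=1}^{m} |\Phi(x_k) - \Phi(x_{k-1})| = \lim_{n \to \infty} \sum_{k=1}^{m} |\Phi_n(x_k) - \Phi_n(x_{k-1})| \le C$,
and hence
    $V_a^b(\Phi) \le C$,    (22)
i.e., $\Phi$ is of bounded variation on $[a, b]$, as asserted.
Next we show that (20) holds if $f$ is a step function. Suppose
    $f(x) = h_k$ if $x_{k-1} \le x < x_k$.
Then
    $\int_a^b f(x) \, d\Phi_n(x) = \sum_{k=1}^{m} h_k [\Phi_n(x_k) - \Phi_n(x_{k-1})]$    (23)
and¹²
    $\int_a^b f(x) \, d\Phi(x) = \sum_{k=1}^{m} h_k [\Phi(x_k) - \Phi(x_{k-1})]$,    (24)
where obviously (23) approaches (24) as $n \to \infty$. Now let $f$ be continuous
on $[a, b]$. Given any $\varepsilon > 0$, choose a step function $f_\varepsilon$ such that
    $|f(x) - f_\varepsilon(x)| < \frac{\varepsilon}{3C}$  $(a \le x \le b)$    (25)
(why is this possible?). Then
    $\left| \int_a^b f(x) \, d\Phi(x) - \int_a^b f(x) \, d\Phi_n(x) \right| \le |I_1| + |I_2| + |I_3|$,    (26)
where
    $I_1 = \int_a^b f(x) \, d\Phi(x) - \int_a^b f_\varepsilon(x) \, d\Phi(x)$,
    $I_2 = \int_a^b f_\varepsilon(x) \, d\Phi(x) - \int_a^b f_\varepsilon(x) \, d\Phi_n(x)$,
    $I_3 = \int_a^b f_\varepsilon(x) \, d\Phi_n(x) - \int_a^b f(x) \, d\Phi_n(x)$.
By the inequality (14), which clearly holds for Lebesgue-Stieltjes integrals
as well as for Riemann-Stieltjes integrals (why?), we have
    $|I_1| = \left| \int_a^b [f(x) - f_\varepsilon(x)] \, d\Phi(x) \right| \le \frac{\varepsilon}{3C} V_a^b(\Phi) \le \frac{\varepsilon}{3}$,
    $|I_3| = \left| \int_a^b [f_\varepsilon(x) - f(x)] \, d\Phi_n(x) \right| \le \frac{\varepsilon}{3C} V_a^b(\Phi_n) \le \frac{\varepsilon}{3}$,    (27)
after using (21), (22) and (25). Moreover, as just shown,
    $|I_2| < \frac{\varepsilon}{3}$    (28)
for sufficiently large $n$. It follows from (26)-(28) that
    $\left| \int_a^b f(x) \, d\Phi(x) - \int_a^b f(x) \, d\Phi_n(x) \right| < \varepsilon$,
which implies (20), since $\varepsilon > 0$ is arbitrary. ∎

¹² Think of (23) and (24) as Lebesgue-Stieltjes integrals.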
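
Helly's convergence theorem can be watched at work on a concrete sequence. The example below is not from the text: it assumes $\Phi_n(x) = x + \sin(2\pi n x)/(2\pi n)$ on [0, 1], which is nondecreasing with total variation 1 and converges to $\Phi(x) = x$, and takes $f(x) = x^2$. Integration by parts gives $\int_0^1 f \, d\Phi_n = 1/3 + 1/(2\pi^2 n^2)$ for this choice, so the integrals should tend to 1/3. The integrals are approximated by Riemann-Stieltjes sums with midpoints as intermediate points; all function names are hypothetical.

```python
import math

def stieltjes_sum(f, phi, n_cells=20000):
    """Riemann-Stieltjes sum of f with respect to phi on [0, 1], midpoint tags."""
    total = 0.0
    for k in range(n_cells):
        left, right = k / n_cells, (k + 1) / n_cells
        total += f(0.5 * (left + right)) * (phi(right) - phi(left))
    return total

def make_phi(n):
    """Phi_n(x) = x + sin(2 pi n x) / (2 pi n), nondecreasing with variation 1."""
    return lambda x: x + math.sin(2 * math.pi * n * x) / (2 * math.pi * n)

f = lambda x: x * x
limit_value = stieltjes_sum(f, lambda x: x)     # integral of f with respect to Phi(x) = x
for n in (1, 5, 25):
    print(n, stieltjes_sum(f, make_phi(n)), limit_value)
```

The bound (21) holds here with C = 1, which is the hypothesis the theorem needs in addition to pointwise convergence.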


Theorem 4 gives conditions under which we can take the limit of a sequence
$\{\Phi_n\}$ of functions of bounded variation inside a Stieltjes integral.
The next theorem gives conditions guaranteeing the existence of a sequence
$\{\Phi_n\}$ meeting the requirements of Theorem 4.


THEOREM 5 (Helly's selection principle). Let $\Phi$ be a family of functions
$\varphi$ defined on an interval $[a, b]$ and satisfying the conditions
    $V_a^b(\varphi) \le C$,  $\sup_{a \le x \le b} |\varphi(x)| \le M$    (29)
for suitable $C$ and $M$. Then $\Phi$ contains a sequence which converges for
every $x \in [a, b]$.


Proof. It is enough to prove the theorem for nondecreasing functions.
In fact, let
    $\varphi = v - g$,
where $v$ is the total variation of $\varphi$ on $[a, x]$. Then the functions $v$
corresponding to all $\varphi \in \Phi$ are nondecreasing and satisfy the conditions of
the theorem, since
    $V_a^b(v) = V_a^b(\varphi) \le C$,  $\sup_{a \le x \le b} |v(x)| \le C$.
Assuming that the theorem holds for nondecreasing functions, we choose
a sequence $\{\varphi_n\}$ from $\Phi$ such that $v_n$ converges to a limit $v^*$ on $[a, b]$.
Then the functions
    $g_n = v_n - \varphi_n$
are also nondecreasing and satisfy the conditions of the theorem (why?).
Therefore $\{\varphi_n\}$ contains a subsequence $\{\varphi_{n_k}\}$ such that $\{g_{n_k}\}$ converges
to a limit $g^*$ on $[a, b]$. But then
    $\lim_{k \to \infty} \varphi_{n_k}(x) = \varphi^*(x)$,
where
    $\varphi^*(x) = v^*(x) - g^*(x)$.
Thus we now proceed to prove the theorem for nondecreasing
functions. Let $r_1, r_2, \ldots, r_n, \ldots$ be the rational points of $[a, b]$. It
follows from (29) that the set of numbers
    $\varphi(r_1)$  $(\varphi \in \Phi)$
is bounded. Hence there is a sequence of functions $\{\varphi_n^{(1)}\}$ converging at
the point $r_1$. Similarly, $\{\varphi_n^{(1)}\}$ contains a subsequence $\{\varphi_n^{(2)}\}$ converging
at the point $r_2$ as well as at $r_1$, $\{\varphi_n^{(2)}\}$ contains a subsequence $\{\varphi_n^{(3)}\}$
converging at the point $r_3$ as well as at $r_1$ and $r_2$, and so on. The "diagonal
sequence"
    $\{\psi_n\} = \{\varphi_n^{(n)}\}$
will then converge at every rational point of $[a, b]$. The limit of this
sequence is a nondecreasing function $\psi$, defined only at the points
$r_1, r_2, \ldots, r_n, \ldots$. We complete the definition of $\psi$ at the remaining
points of $[a, b]$ by setting
    $\psi(x) = \lim_{r \to x - 0,\ r \text{ rational}} \psi(r)$ if $x$ is irrational.
The resulting function $\psi$ is then the limit of $\{\psi_n\}$ at every continuity
point of $\psi$. In fact, let $x^*$ be such a point. Then, given any $\varepsilon > 0$, there
is a $\delta > 0$ such that
    $|\psi(x^*) - \psi(x)| < \frac{\varepsilon}{6}$    (30)
if
    $|x^* - x| < \delta$.
Let $r'$ and $r''$ be rational numbers such that
    $x^* - \delta < r' < x^* < r'' < x^* + \delta$,
and let $n$ be so large that
    $|\psi_n(r') - \psi(r')| < \frac{\varepsilon}{6}$,  $|\psi_n(r'') - \psi(r'')| < \frac{\varepsilon}{6}$.    (31)
It follows from (30) and (31) that
    $|\psi_n(r') - \psi_n(r'')| < \frac{2\varepsilon}{3}$.
Since $\psi_n$ is a nondecreasing function, we have
    $\psi_n(r') \le \psi_n(x^*) \le \psi_n(r'')$,
and hence
    $|\psi(x^*) - \psi_n(x^*)| \le |\psi(x^*) - \psi(r')| + |\psi(r') - \psi_n(r')| + |\psi_n(r') - \psi_n(x^*)|$
        $< \frac{\varepsilon}{6} + \frac{\varepsilon}{6} + \frac{2\varepsilon}{3} = \varepsilon$.
Therefore
    $\lim_{n \to \infty} \psi_n(x^*) = \psi(x^*)$,
since $\varepsilon > 0$ is arbitrary.
Thus we have constructed a sequence $\{\psi_n\}$ of functions in $\Phi$ converging
to a limit function $\psi$ everywhere except possibly at discontinuity
points of $\psi$. Since there are no more than countably many such points
(why?), we can again use the "diagonal process" to find a subsequence
of $\{\psi_n\}$ which converges at these points as well, and hence converges
everywhere on $[a, b]$. ∎


36.6. The Riesz representation theorem. Next we show how Stieltjes
integrals can be used to represent the general linear functional on the space
$C_{[a,b]}$ of all functions continuous on the interval $[a, b]$:


THEOREM 6 (F. Riesz). Every continuous linear functional $\varphi$ on the
space $C_{[a,b]}$ can be represented in the form
    $\varphi(f) = \int_a^b f(x) \, d\Phi(x)$,    (32)
where $\Phi$ is a function of bounded variation on $[a, b]$, and moreover
    $\|\varphi\| = V_a^b(\Phi)$.    (33)


Proof. The space $C_{[a,b]}$ can be regarded as a subspace of the space
$M_{[a,b]}$ of all bounded functions on $[a, b]$, with the same norm
    $\|f\| = \sup_{a \le x \le b} |f(x)|$
as in $C_{[a,b]}$. Let $\varphi$ be a continuous linear functional on $C_{[a,b]}$. By the
Hahn-Banach theorem (Theorem 5, p. 180), $\varphi$ can be extended without
changing its norm from $C_{[a,b]}$ onto the whole space $M_{[a,b]}$. In particular,
this extended functional will be defined on all functions of the form
    $f_\tau(x) = \begin{cases} 1 & \text{if } x < \tau, \\ 0 & \text{if } x \ge \tau \end{cases}$  $(a \le \tau \le b)$.    (34)
Let
    $\Phi(\tau) = \varphi(f_\tau)$.    (35)
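Then $\Phi$ is of bounded variation on $[a, b]$. In fact, given any partition
    $a = x_0 < x_1 < \cdots < x_n = b$    (36)
of $[a, b]$, let
    $\alpha_k = \operatorname{sgn} [\Phi(x_k) - \Phi(x_{k-1})]$  $(k = 1, \ldots, n)$,
where
    $\operatorname{sgn} x = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases}$
Then
    $\sum_{k=1}^{n} |\Phi(x_k) - \Phi(x_{k-1})| = \sum_{k=1}^{n} \alpha_k [\Phi(x_k) - \Phi(x_{k-1})]$
        $= \sum_{k=1}^{n} \alpha_k \varphi(f_{x_k} - f_{x_{k-1}}) = \varphi \left( \sum_{k=1}^{n} \alpha_k (f_{x_k} - f_{x_{k-1}}) \right)$
        $\le \|\varphi\| \left\| \sum_{k=1}^{n} \alpha_k (f_{x_k} - f_{x_{k-1}}) \right\|$.
But the function
    $\sum_{k=1}^{n} \alpha_k (f_{x_k} - f_{x_{k-1}})$
can only take the values $0, \pm 1$, and hence its norm does not exceed 1. Therefore
    $\sum_{k=1}^{n} |\Phi(x_k) - \Phi(x_{k-1})| \le \|\varphi\|$.
Since this is true for any partition of $[a, b]$, we have
    $V_a^b(\Phi) \le \|\varphi\|$,    (37)
i.e., $\Phi$ is of bounded variation on $[a, b]$, as asserted.

We now show that the functional $\varphi$ can be represented in the form of a
Stieltjes integral with respect to the function $\Phi$ just constructed. Let $f$
be any function continuous on $[a, b]$. Given any $\varepsilon > 0$, let $\delta > 0$ be
such that $|x' - x''| < \delta$ implies $|f(x') - f(x'')| < \varepsilon$. Suppose the
partition (36) is such that each subinterval $[x_{k-1}, x_k]$ is of length less than
$\delta$, and consider the step function
    $f^{(\varepsilon)}(x) = f(x_k)$ if $x_{k-1} \le x < x_k$  $(k = 1, \ldots, n)$,
which can obviously be written in the form
    $f^{(\varepsilon)}(x) = \sum_{k=1}^{n} f(x_k) [f_{x_k}(x) - f_{x_{k-1}}(x)]$,    (38)
where $f_\tau$ is the function defined by (34). Clearly,
    $|f(x) - f^{(\varepsilon)}(x)| < \varepsilon$
for all $x \in [a, b]$,¹³ i.e.,
    $\|f - f^{(\varepsilon)}\| < \varepsilon$.    (39)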
It follows from (35) and (38) that 


Af) =E SOP San) — Cad = ZL ODO) — OD 





13 We complete the definition of $f^{(\varepsilon)}$ by setting $f^{(\varepsilon)}(b) = f(x_n) = f(b)$ for every $\varepsilon > 0$.
i.e., $\varphi(f^{(\varepsilon)})$ is an “approximating sum” of the Riemann-Stieltjes integral

$$\int_a^b f(x)\, d\Phi(x).$$

Therefore
$$\left| \varphi(f^{(\varepsilon)}) - \int_a^b f(x)\, d\Phi(x) \right| < \varepsilon$$
for a “sufficiently fine” partition of the interval [a, b]. On the other
hand,
$$|\varphi(f) - \varphi(f^{(\varepsilon)})| \le \|\varphi\|\, \|f - f^{(\varepsilon)}\| \le \|\varphi\|\, \varepsilon$$
because of (39). But then

$$\left| \varphi(f) - \int_a^b f(x)\, d\Phi(x) \right| < (\|\varphi\| + 1)\, \varepsilon,$$

which implies (32), since $\varepsilon > 0$ is arbitrary. To prove (33), we merely
combine (37) with the opposite inequality

$$\|\varphi\| \le V_a^b(\Phi),$$

which is an immediate consequence of Theorem 2 and the representation
(32). ∎
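
The “approximating sums” used in the proof can be tried out numerically. The following is only a minimal sketch in Python: the integrator (the function x plus a unit jump at x = 1/2, taken continuous from the left), the integrand cos x, and the helper name `stieltjes_sum` are all illustrative assumptions and do not come from the text.

```python
import math

def stieltjes_sum(f, Phi, a, b, n):
    """Sum of f(x_k) * [Phi(x_k) - Phi(x_{k-1})] over a uniform partition of [a, b]."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (Phi(xs[k]) - Phi(xs[k - 1])) for k in range(1, n + 1))

def Phi(x):
    # Hypothetical function of bounded variation: x plus a unit jump at 0.5,
    # continuous from the left at the jump.
    return x + (1.0 if x > 0.5 else 0.0)

f = math.cos

# Fine partition of [0, 1]
approx = stieltjes_sum(f, Phi, 0.0, 1.0, 20000)

# For this particular Phi the integral splits into an ordinary integral
# plus the value of f at the jump, times the size of the jump.
exact = math.sin(1.0) + math.cos(0.5)

print(approx, exact)
```

For this choice of integrator the sums settle near sin 1 + cos(1/2) as the partition is refined, in line with the estimate in the proof.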


Problem 1. Let $\mu$ be an arbitrary finite $\sigma$-additive measure on the real
line $(-\infty, \infty)$. Represent $\mu$ as the Stieltjes measure corresponding to some
generating function F.


Hint. Let $F(x) = \mu(-\infty, x)$.


Comment. Thus the term “Stieltjes measure” does not refer to a special
kind of measure, but rather to a special way of constructing a measure (by
using a generating function).


Problem 2. Let $\Phi$ be a function of bounded variation with two distinct
representations $\Phi = v - g$, $\Phi = v^* - g^*$ in terms of nondecreasing functions
$v$, $g$, $v^*$ and $g^*$ (give an example). Prove that

$$\int_a^b f(x)\, dv(x) - \int_a^b f(x)\, dg(x) = \int_a^b f(x)\, dv^*(x) - \int_a^b f(x)\, dg^*(x).$$

Comment. Thus in the definition (8) of the Lebesgue-Stieltjes integral
with respect to a function of bounded variation $\Phi$, the particular representation
of $\Phi$ as a difference between two nondecreasing functions does not
matter, i.e., $v$ need not be the total variation of $\Phi$ on [a, x].
Problem 3. Let $\xi$ be the number of spots obtained in throwing an unbiased
die. Find the mean and variance of $\xi$.


Ans. $E\xi = \tfrac{7}{2}$, $D\xi = \tfrac{35}{12}$.
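
For a discrete random variable the Stieltjes integrals giving the mean and variance reduce to finite sums, so the answer to Problem 3 can be checked by direct computation. The sketch below uses exact rational arithmetic; the variable names are illustrative only.

```python
from fractions import Fraction

faces = range(1, 7)      # possible numbers of spots
p = Fraction(1, 6)       # each face is equally likely for an unbiased die

# Mean: sum of k * P(xi = k)
mean = sum(p * k for k in faces)

# Variance: sum of (k - mean)^2 * P(xi = k)
variance = sum(p * (k - mean) ** 2 for k in faces)

print(mean, variance)
```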

Problem 4. Find the mean and variance of the random variable $\xi$ with
probability density

$$p(x) = \tfrac{1}{2} e^{-|x|} \qquad (-\infty < x < \infty).$$

Problem 5. Let $\xi$ be the random variable with probability density

$$p(x) = \frac{1}{\pi (1 + x^2)} \qquad (-\infty < x < \infty).$$

Prove that $E\xi$ and $D\xi$ fail to exist.


Problem 6. Discuss random variables which are neither discrete nor 
continuous. 


Problem 7. Given a random variable $\xi$ with distribution function F,
consider the new random variable $\eta = \varphi(\xi)$, where $\varphi$ is a function summable
with respect to the Stieltjes measure $\mu_F$ generated by F. Express $E\eta$ and
$D\eta$ in terms of F.


Hint. Consider the problem of changing variables in a Lebesgue integral.
Ans. For example, $E\eta = \int_{-\infty}^{\infty} \varphi(x)\, dF(x)$.


Problem 8. Prove that if f is continuous on [a, b], then the Riemann-Stieltjes
integral

$$\int_a^b f(x)\, d\Phi(x) \tag{40}$$

does not depend on the values taken by $\Phi$ at its discontinuity points in (a, b).


Hint. Use Theorem 3 and formula (13).


Comment. Hence if f is continuous, we need not insist that $\Phi$ be continuous
from the left at its discontinuity points in (a, b). In fact, $\Phi$ can be
assigned arbitrary values at these points.


Problem 9. Write formulas for the Riemann-Stieltjes integral (40) in the
case where f is continuous and


a) $\Phi$ is a jump function;
b) $\Phi$ is an absolutely continuous function with a Riemann-integrable
derivative.
Problem 10. Evaluate the following Riemann-Stieltjes integrals:


0 if x=—1, 


a) f° xdF(x), where F(x)={ 1 if -1<x<2, 
1 if 2<x<3; 
-1 if O<x<}, 
if d<x <3, 
b) [° x* dF(x), where F(x) = ; : 
? if x= 3, 
—2 if $<x<2; 
x if O<x<f, 
c) fe dF(x), where F(x) = { | 
° = if 4 <*< 1 
x 


Problem 11. Develop a theory of Riemann-Stieltjes integration on the
whole real line $(-\infty, \infty)$.

Problem 12. Extend Theorem 4 to the case where $a = -\infty$ or $b = \infty$
(or both), assuming that f(x) approaches a limit as $x \to \pm\infty$.

Problem 13. Let $\{\Phi_n\}$ be the same as in Theorem 4, and let $\{f_n\}$ be a
sequence of continuous functions on [a, b] converging uniformly to a limit f.
Prove that

$$\lim_{n \to \infty} \int_a^b f_n(x)\, d\Phi_n(x) = \int_a^b f(x)\, d\Phi(x).$$


Problem 14. Prove that there is a one-to-one correspondence between
the set of all continuous linear functionals $\varphi$ on $C_{[a,b]}$ and the space $V^0_{[a,b]}$
of Problem 8, p. 332, provided we identify any two elements of $V^0_{[a,b]}$ which
coincide at all their continuity points. Prove that the inequality

$$V_a^b(\Phi) \le \|\varphi\|$$

need not hold for every $\Phi \in V^0_{[a,b]}$ corresponding to a given functional
$\varphi$ on $C_{[a,b]}$, but that there is always at least one such element $\Phi$ for which
the inequality holds.


37. The Spaces $L_1$ and $L_2$


37.1. Definition and basic properties of $L_1$. Let X be a space equipped
with a measure $\mu$, where the measure of X itself may be either finite or
infinite. Then by $L_1(X, \mu)$, or simply $L_1$, we mean the set of all real functions
f summable on X (however, see Problem 1). Clearly $L_1$ is a linear space
(with addition of functions and multiplication of functions by numbers
defined in the usual way), since a linear combination of summable functions
is again a summable function. To introduce a norm in $L_1$, we define

$$\|f\| = \int |f(x)|\, d\mu, \tag{1}$$

where, as in the rest of this section, the symbol $\int$ by itself denotes integration
over the whole space X. Of the various properties of a norm (see p. 138),
it follows at once from (1) that

$$\|f\| \ge 0,$$
$$\|\alpha f\| = |\alpha|\, \|f\|,$$
$$\|f_1 + f_2\| \le \|f_1\| + \|f_2\|,$$

and we need only verify that $\|f\| = 0$ if and only if $f = 0$. To ensure this,
we agree to regard equivalent functions (i.e., functions differing only on
a set of measure zero) as identical elements of the space $L_1$. Thus the
elements of $L_1$ are, to be perfectly exact, classes of equivalent summable
functions.¹⁴ In particular, the zero element of $L_1$ is the class consisting of all
functions vanishing almost everywhere. With this understanding, we will
continue to talk (more casually) about “functions in $L_1$.”

In $L_1$, as in any normed linear space, we can use the formula

$$\rho(f, g) = \|f - g\|$$

to define a distance. Let $\{f_n\}$ be a sequence of functions in $L_1$. Then $\{f_n\}$
is said to converge in the mean to a function $f \in L_1$ if $\rho(f_n, f) \to 0$ as $n \to \infty$.


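THEOREM 1. The space $L_1$ is complete.

Proof. Let $\{f_n\}$ be a Cauchy sequence in $L_1$, so that
$$\|f_m - f_n\| \to 0 \quad \text{as } m, n \to \infty.$$

Then we can find a sequence of indices $\{n_k\}$ (where $n_1 < n_2 < \cdots < n_k < \cdots$) such that

$$\|f_{n_k} - f_{n_{k+1}}\| = \int |f_{n_k}(x) - f_{n_{k+1}}(x)|\, d\mu < \frac{1}{2^k} \qquad (k = 1, 2, \ldots).$$

It follows from the corollary to Levi’s theorem (see p. 307) that the series

$$|f_{n_1}| + |f_{n_2} - f_{n_1}| + \cdots$$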


14 Thus the precise definition of addition of two elements $\varphi_1, \varphi_2 \in L_1$ is the following:
Let $f_1$ and $f_2$ be “representatives” of $\varphi_1$ and $\varphi_2$, respectively, i.e., let $f_1 \in \varphi_1$, $f_2 \in \varphi_2$. Then
$\varphi_1 + \varphi_2$ is the class containing $f_1 + f_2$ (this class clearly does not depend on the particular
choice of $f_1$ and $f_2$).


converges almost everywhere on X. Therefore the series

$$f_{n_1} + (f_{n_2} - f_{n_1}) + \cdots$$

also converges almost everywhere on X to some function
$$f(x) = \lim_{k \to \infty} f_{n_k}(x).$$

But $\{f_{n_k}\}$ converges in the mean to the same function f. In fact, given
any $\varepsilon > 0$,

$$\int |f_{n_k}(x) - f_{n_l}(x)|\, d\mu < \varepsilon \tag{2}$$

for sufficiently large k and l, since $\{f_{n_k}\}$ is a Cauchy sequence. Hence,
by Fatou’s theorem (Theorem 3, p. 307), we can take the limit as $l \to \infty$
behind the integral sign in (2), obtaining

$$\int |f_{n_k}(x) - f(x)|\, d\mu \le \varepsilon.$$

It follows that $f \in L_1$ (why?) and that $f_{n_k} \to f$ in the mean. But if a Cauchy
sequence contains a subsequence converging to a limit, then the sequence
itself must converge to the same limit. Hence $f_n \to f$ in the mean. ∎


According to the definition of the Lebesgue integral (see p. 296), given
any function f summable on X and any $\varepsilon > 0$, there is a summable simple
function $\varphi(x)$ such that

$$\int |f(x) - \varphi(x)|\, d\mu < \varepsilon.$$

Moreover, the Lebesgue integral of a summable simple function $\varphi$ taking
values $y_1, y_2, \ldots$ on sets $E_1, E_2, \ldots$ is defined as the sum of the series

$$\sum_{n=1}^{\infty} y_n\, \mu(E_n)$$

(assumed to converge absolutely). Therefore every summable simple function
can be represented as the limit in the mean (i.e., as the limit in the sense of
convergence in the mean) of a sequence of summable simple functions,
each taking only finitely many values. In fact, given any $\varepsilon > 0$, let N be
such that

$$\sum_{n > N} |y_n|\, \mu(E_n) < \varepsilon,$$

and let¹⁵

$$\varphi_N(x) = \begin{cases} y_k & \text{if } x \in E_k,\ 1 \le k \le N, \\ 0 & \text{otherwise}. \end{cases}$$

15 Note that $\varphi_N$ is a finite linear combination of characteristic functions, namely
$$\varphi_N(x) = y_1 \chi_{E_1}(x) + \cdots + y_N \chi_{E_N}(x)$$
(see footnote 11, p. 349).

Then
$$\int |\varphi(x) - \varphi_N(x)|\, d\mu = \sum_{n > N} |y_n|\, \mu(E_n) < \varepsilon.$$

In other words, the set of all simple functions taking only finitely many values
is everywhere dense in the space $L_1$.
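
The choice of N in this truncation argument can be carried out explicitly in a simple case. As a sketch, assume (purely for illustration) a simple function taking the value $y_n = n$ on a set $E_n$ of measure $2^{-n}$, so that the series of $|y_n|\, \mu(E_n)$ converges to 2. The function names below are hypothetical.

```python
from fractions import Fraction

TOTAL = Fraction(2)  # sum over n >= 1 of n / 2^n for the assumed example

def partial_sum(N):
    # Sum of |y_n| * mu(E_n) for n = 1, ..., N, with y_n = n and mu(E_n) = 2^(-n)
    return sum(Fraction(n, 2 ** n) for n in range(1, N + 1))

def tail(N):
    # Distance in the mean between the simple function and its truncation:
    # the sum of |y_n| * mu(E_n) over n > N
    return TOTAL - partial_sum(N)

def smallest_N(eps):
    # Least N for which the tail falls below eps
    N = 0
    while tail(N) >= eps:
        N += 1
    return N

N = smallest_N(Fraction(1, 1000))
print(N, tail(N))
```

Any larger N works equally well; the argument in the text only needs some N with a tail below the prescribed bound.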


THEOREM 2. Let X be a metric space equipped with a measure $\mu$ such
that¹⁶


1) Every open set and every closed set in X is measurable;
2) If a set $M \subset X$ is measurable, then

$$\mu(M) = \inf_{M \subset G} \mu(G), \tag{3}$$

where the greatest lower bound is taken over all open sets $G \subset X$
containing M.

Then the set of all continuous functions on X is everywhere dense in
$L_1(X, \mu)$.


Proof. We need only show that every simple function taking only
finitely many values is the limit in the mean of a sequence of continuous
functions. But every simple function taking only finitely many values is
a finite linear combination of characteristic functions of measurable sets,
and hence we need only show that every such characteristic function
$\chi_M(x)$ is the limit in the mean of a sequence of continuous functions.
If $M \subset X$ is measurable, then (3) implies that given any $\varepsilon > 0$, there is a
closed set $F_M$ and an open set $G_M$ such that

$$F_M \subset M \subset G_M, \qquad \mu(G_M) - \mu(F_M) < \varepsilon. \tag{4}$$

Now let¹⁷
$$\varphi_\varepsilon(x) = \frac{\rho(X - G_M, x)}{\rho(X - G_M, x) + \rho(F_M, x)}.$$
Then

$$\varphi_\varepsilon(x) = \begin{cases} 0 & \text{if } x \in X - G_M, \\ 1 & \text{if } x \in F_M. \end{cases}$$

Moreover, $\varphi_\varepsilon$ is continuous, since $\rho(F_M, x)$ and $\rho(X - G_M, x)$ are both
continuous functions, with a nonvanishing sum. But $|\chi_M - \varphi_\varepsilon|$ does not
exceed 1 on $G_M - F_M$ and vanishes outside this set. Using (4), we find that

$$\int |\chi_M(x) - \varphi_\varepsilon(x)|\, d\mu < \varepsilon. \; ∎$$
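
The construction in this proof can be followed numerically on the real line with ordinary Lebesgue measure. The sketch below assumes, for illustration only, M = F = [0, 1] and G = (−δ, 1 + δ), so that the measures of G and F differ by 2δ; the function names and the midpoint-rule integration are likewise assumptions of the sketch.

```python
def dist_to_F(x):
    # Distance from x to the closed set F = [0, 1]
    return max(0.0, -x, x - 1.0)

def dist_to_complement_of_G(x, delta):
    # Distance from x to the complement of the open set G = (-delta, 1 + delta)
    return max(0.0, min(x + delta, 1.0 + delta - x))

def phi(x, delta):
    # Continuous function from the proof: equals 0 outside G and 1 on F
    d_out = dist_to_complement_of_G(x, delta)
    return d_out / (d_out + dist_to_F(x))

def chi_M(x):
    # Characteristic function of M = [0, 1]
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def l1_distance(delta, lo=-1.0, hi=2.0, n=30000):
    # Midpoint-rule approximation of the integral of |chi_M - phi| over [lo, hi]
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += abs(chi_M(x) - phi(x, delta))
    return h * total

delta = 0.05
err = l1_distance(delta)
print(err, 2 * delta)
```

In this example the distance in the mean is about δ, comfortably below the bound 2δ supplied by (4).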


16 These conditions are satisfied by ordinary Lebesgue measure in n-space, and in 
many other cases of practical interest. 

17 As usual, $\rho(A, x)$ denotes the distance between the set A and the point x (see Problem
9, p. 54).
The space $L_1(X, \mu)$ depends on the choice of both the space X and the
measure $\mu$. For example, $L_1(X, \mu)$ is essentially a finite-dimensional space
if $\mu$ is concentrated on a finite set of points (why?). In analysis, we are
mainly interested in the case where $L_1$ is infinite-dimensional but has a
countable everywhere dense subset.¹⁸ To characterize such spaces, we
introduce the following concept, stemming from general measure theory:


DEFINITION. Suppose a space X equipped with a measure $\mu$ has a
countable system $\mathcal{A}$ of measurable subsets $A_1, A_2, \ldots$ such that given any
$\varepsilon > 0$ and any measurable subset $M \subset X$, there is a set $A_k \in \mathcal{A}$ satisfying
the inequality

$$\mu(M \,\triangle\, A_k) < \varepsilon.$$

Then $\mu$ is said to have a countable base, consisting of the sets $A_1, A_2, \ldots$.


Example. Let $\mu$ be a Lebesgue extension of a measure m originally
defined on a countable semiring $\mathcal{S}_m$. Then the ring $\mathcal{R}(\mathcal{S}_m)$ is obviously
itself countable, and hence, by Theorem 3, p. 277, is a countable base for $\mu$.
In particular, ordinary Lebesgue measure on the line has a countable base,
since we can choose the original semiring $\mathcal{S}_m$ to consist of all intervals (open,
closed and half-open) with rational end points.


THEOREM 3. Let X be a space equipped with a measure $\mu$, and suppose
$\mu$ has a countable base $A_1, A_2, \ldots$. Then $L_1(X, \mu)$ has a countable
everywhere dense subset.


Proof. We will show that the set M of all finite linear combinations
of the form

$$\sum_{k=1}^{n} c_k f_k(x), \tag{5}$$

where $f_k$ is the characteristic function of $A_k$ and the numbers $c_1, \ldots, c_n$
are rational, forms a countable everywhere dense subset of $L_1 = L_1(X, \mu)$.
The countability of M is obvious, and we need only show that M is
everywhere dense in $L_1$. As already noted, the set of all simple functions
taking only finitely many values is everywhere dense in $L_1$. But every such
function can be approximated arbitrarily closely by a function of the same
type taking only rational values. Hence we need only show that every
function f taking rational values $y_1, \ldots, y_n$ on pairwise disjoint sets
$E_1, \ldots, E_n$ (with X as their union) can be approximated arbitrarily
closely in the $L_1$-metric by functions of the form (5). Clearly, there is
no loss of generality in assuming that the base $A_1, A_2, \ldots$ is closed under
the operations of taking differences and forming finite unions and
intersections (why?).





18 So that $L_1$ is separable, as defined on p. 48.
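Now, according to the definition, given any $\varepsilon > 0$, there are sets
$A_1, \ldots, A_n$ such that

$$\mu((E_k - A_k) \cup (A_k - E_k)) < \varepsilon \qquad (k = 1, \ldots, n).$$

Let
$$A'_k = A_k - \bigcup_{i < k} A_i \qquad (k = 1, \ldots, n),$$
and define a function
$$f^*(x) = \begin{cases} y_k & \text{if } x \in A'_k, \\ 0 & \text{if } x \in X - \bigcup_{k=1}^{n} A'_k. \end{cases}$$

Then clearly
$$\mu\{x : f(x) \ne f^*(x)\}$$
is arbitrarily small for sufficiently small $\varepsilon$, and hence the left-hand side of
$$\int |f(x) - f^*(x)|\, d\mu \le 2 \left( \max_k |y_k| \right) \mu\{x : f(x) \ne f^*(x)\}$$
can be made arbitrarily small by choosing $\varepsilon > 0$ sufficiently small. This
proves the theorem, since $f^*$ is a function of the form (5). ∎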


37.2. Definition and basic properties of $L_2$. As we have seen, the space
$L_1 = L_1(X, \mu)$ is a Banach space, i.e., a complete normed linear space.
However, $L_1$ is not Euclidean, since its norm cannot be derived from any
scalar product. This follows from the “parallelogram theorem” (Theorem
15, p. 160). For example, if $X = [0, 2\pi]$ and $\mu$ is ordinary Lebesgue measure
on the line, then the condition

$$\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2)$$

fails for the summable functions $f(x) = 1$, $g(x) = \sin x$.¹⁹ To get a function
space which is not only a normed linear space but also a Euclidean space,
we now consider the set of functions whose squares are summable.
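
The failure of the parallelogram condition for f(x) = 1, g(x) = sin x can be confirmed by a direct computation of the four norms involved. The sketch below approximates the integrals over [0, 2π] by the midpoint rule; the helper name `l1_norm` and the number of subdivisions are assumptions of the sketch.

```python
import math

def l1_norm(h, a=0.0, b=2 * math.pi, n=20000):
    """Midpoint-rule approximation of the integral of |h(x)| over [a, b]."""
    step = (b - a) / n
    return step * sum(abs(h(a + (i + 0.5) * step)) for i in range(n))

def f(x):
    return 1.0

g = math.sin

# Left-hand and right-hand sides of the parallelogram condition
lhs = l1_norm(lambda x: f(x) + g(x)) ** 2 + l1_norm(lambda x: f(x) - g(x)) ** 2
rhs = 2 * (l1_norm(f) ** 2 + l1_norm(g) ** 2)

print(lhs, rhs)
```

Here the norms of 1 + sin x, 1 − sin x and 1 all equal 2π, while the norm of sin x equals 4, so the two sides differ by 32.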

Thus let X be a space equipped with a measure $\mu$, where we temporarily
assume that $\mu(X) < \infty$. Then by $L_2(X, \mu)$, or simply $L_2$, we mean the set of
all real functions f whose squares are summable on X, i.e., which satisfy
the condition

$$\int f^2(x)\, d\mu < \infty$$

(however, see Problem 6). As in the case of $L_1$, we do not distinguish
between equivalent functions (i.e., functions differing only on a set of
measure zero).





19 As an exercise, show that the same kind of counterexample works quite generally. 
THEOREM 4. If f and g belong to $L_2$, then so do $\alpha f$ and $f + g$, where
$\alpha$ is an arbitrary constant, while the product fg belongs to $L_1$. In particular, $L_2$ is a linear space.

Proof. Obviously $\alpha f \in L_2$, since
$$\int [\alpha f(x)]^2\, d\mu = \alpha^2 \int f^2(x)\, d\mu < \infty.$$
The fact that $fg \in L_1$ follows from the inequality
$$|f(x) g(x)| \le \tfrac{1}{2} [f^2(x) + g^2(x)] \tag{6}$$
and Theorem 3, p. 297.²⁰ But then $f + g \in L_2$, since
$$[f(x) + g(x)]^2 \le f^2(x) + 2\, |f(x) g(x)| + g^2(x),$$
where each term on the right is summable. ∎


Next we define a scalar product in $L_2$, setting

$$(f, g) = \int f(x) g(x)\, d\mu.$$

This choice obviously has all the properties of a scalar product listed on
p. 142:


1) $(f, f) \ge 0$, where $(f, f) = 0$ if and only if $f = 0$;
2) $(f, g) = (g, f)$;
3) $(\lambda f, g) = \lambda (f, g)$;
4) $(f, g_1 + g_2) = (f, g_1) + (f, g_2)$.


(In asserting that $(f, f) = 0$ if and only if $f = 0$, we rely on the fact that
every function vanishing almost everywhere is identified with the zero element
of $L_2$.) Thus $L_2$ is a Euclidean space, with the norm defined by the usual
formula

$$\|f\| = \sqrt{(f, f)} \tag{7}$$

(recall Theorem 1, p. 142). In the case of $L_2$, (7) takes the form
$$\|f\| = \sqrt{\int f^2(x)\, d\mu}.$$
By the same token, the distance between two elements $f, g \in L_2$ is just
$$\rho(f, g) = \|f - g\| = \sqrt{\int [f(x) - g(x)]^2\, d\mu}.$$
The quantity
$$\int [f(x) - g(x)]^2\, d\mu = \|f - g\|^2$$

is called the mean square deviation of the functions f and g (from each other).





20 Setting $g(x) = 1$ in (6), we find that $f \in L_2$ implies $f \in L_1$ (provided that X is of finite
measure).
Let $\{f_n\}$ be a sequence of functions in $L_2$. Then $\{f_n\}$ is said to converge in
the mean square to a function $f \in L_2$ if $\rho(f_n, f) \to 0$ as $n \to \infty$.
In $L_2$, as in any other Euclidean space, we have the Schwarz inequality

$$|(f, g)| \le \|f\|\, \|g\|,$$

which here takes the form

$$\left| \int f(x) g(x)\, d\mu \right| \le \sqrt{\int f^2(x)\, d\mu}\, \sqrt{\int g^2(x)\, d\mu}. \tag{8}$$

The $L_2$-version of the triangle inequality

$$\|f + g\| \le \|f\| + \|g\|$$

is clearly

$$\sqrt{\int [f(x) + g(x)]^2\, d\mu} \le \sqrt{\int f^2(x)\, d\mu} + \sqrt{\int g^2(x)\, d\mu}.$$

In particular, replacing f by |f| and setting $g(x) = 1$ in (8), we get

$$\int |f(x)|\, d\mu \le \sqrt{\mu(X)}\, \sqrt{\int f^2(x)\, d\mu}, \tag{9}$$

from which it is again apparent (cf. footnote 20) that $f \in L_2$ implies $f \in L_1$
if $\mu(X) < \infty$.


THEOREM 5. The space $L_2$ is complete.


Proof. Let $\{f_n\}$ be a Cauchy sequence in $L_2$, so that
$$\|f_m - f_n\| \to 0 \quad \text{as } m, n \to \infty.$$

Then, by (9), given any $\varepsilon > 0$, we have

$$\int |f_m(x) - f_n(x)|\, d\mu \le \sqrt{\mu(X)}\, \sqrt{\int [f_m(x) - f_n(x)]^2\, d\mu} < \varepsilon \sqrt{\mu(X)}$$

for sufficiently large m and n, i.e., $\{f_n\}$ is also a Cauchy sequence in the
$L_1$-metric. Repeating the argument given in the proof of the completeness
of $L_1$, we choose a subsequence $\{f_{n_k}\}$ from $\{f_n\}$ converging almost
everywhere to some function f. Clearly, given any $\varepsilon > 0$, we have

$$\int [f_{n_k}(x) - f_{n_l}(x)]^2\, d\mu < \varepsilon \tag{10}$$

for sufficiently large k and l. Hence, by Fatou’s theorem (Theorem 3,
p. 307), we can take the limit as $l \to \infty$ behind the integral sign in (10),
obtaining

$$\int [f_{n_k}(x) - f(x)]^2\, d\mu \le \varepsilon.$$

It follows that $f \in L_2$ (why?) and that $f_{n_k} \to f$ in the mean square. But if
a Cauchy sequence contains a subsequence converging to a limit, then the
sequence itself must converge to the same limit. Hence $f_n \to f$ in the
mean square. ∎


We now drop the restriction $\mu(X) < \infty$, allowing X to have infinite
measure. In the case $\mu(X) = \infty$, it is no longer true that $f \in L_2$ implies
$f \in L_1$, a fact deduced from (6) or (9) in the case $\mu(X) < \infty$. For example,
let X be the real line equipped with ordinary Lebesgue measure, and let

$$f(x) = \frac{1}{\sqrt{1 + x^2}}.$$

Then f belongs to $L_2$ but not to $L_1$, since

$$\int_{-\infty}^{\infty} \frac{dx}{\sqrt{1 + x^2}} = \infty, \qquad \int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = \pi < \infty.$$
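
The two integrals in this example have elementary antiderivatives, so their behavior can be watched on growing intervals [−R, R]. The following sketch (with hypothetical function names) evaluates the truncated integrals in closed form: the first stays below π, while the second grows like 2 log(2R).

```python
import math

def truncated_l2_norm_sq(R):
    # Integral of 1 / (1 + x^2) over [-R, R], equal to 2 * arctan(R)
    return 2 * math.atan(R)

def truncated_l1_norm(R):
    # Integral of 1 / sqrt(1 + x^2) over [-R, R], equal to 2 * arcsinh(R)
    return 2 * math.asinh(R)

for R in (1e1, 1e3, 1e6, 1e12):
    print(R, truncated_l2_norm_sq(R), truncated_l1_norm(R))
```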


Moreover, if a sequence $\{f_n\}$ converges to a limit f in the $L_2$-metric, it
follows from (9) that $\{f_n\}$ also converges to f in the $L_1$-metric if $\mu(X) < \infty$.
However, this conclusion fails if $\mu(X) = \infty$, as shown by the example

$$f_n(x) = \begin{cases} \dfrac{1}{n} & \text{if } |x| \le n, \\ 0 & \text{if } |x| > n, \end{cases}$$

where $\{f_n\}$ approaches no limit in $L_1$ but approaches the zero function in $L_2$
(give the details). Despite all this, we have²¹


THEOREM 5′. The space $L_2$ is complete even if $\mu(X) = \infty$, provided
that $\mu$ is $\sigma$-finite.


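Proof. As in Sec. 30.2, let
$$X = \bigcup_{n=1}^{\infty} X_n, \qquad \mu(X_n) < \infty,$$
where
$$X_1 \subset X_2 \subset \cdots \subset X_n \subset \cdots.$$
Moreover, given any function $\varphi$ on X, let
$$\varphi^{(n)}(x) = \begin{cases} \varphi(x) & \text{if } x \in X_n, \\ 0 & \text{if } x \notin X_n, \end{cases}$$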




21 Note that in the proof of the completeness of $L_1$ (Theorem 1), X can have either
finite or infinite measure.
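so that
$$\int_X \varphi(x)\, d\mu = \lim_{n \to \infty} \int_{X_n} \varphi(x)\, d\mu = \lim_{n \to \infty} \int_X \varphi^{(n)}(x)\, d\mu,$$

if $\varphi$ is summable on X. Let $\{f_k\}$ be a Cauchy sequence in $L_2$, so that,
given any $\varepsilon > 0$,

$$\int [f_k(x) - f_l(x)]^2\, d\mu < \varepsilon$$
for all sufficiently large k and l. Then
$$\lim_{n \to \infty} \int [f_k^{(n)}(x) - f_l^{(n)}(x)]^2\, d\mu = \int [f_k(x) - f_l(x)]^2\, d\mu < \varepsilon,$$

and hence, a fortiori,
$$\int_{X_n} [f_k^{(n)}(x) - f_l^{(n)}(x)]^2\, d\mu < \varepsilon. \tag{11}$$

But $L_2(X_n, \mu)$ is complete, by Theorem 5, since $\mu(X_n) < \infty$. Therefore
$\{f_l^{(n)}\}$ converges in the metric of $L_2(X_n, \mu)$ to a function $f^{(n)} \in L_2(X_n, \mu)$.
Taking the limit as $l \to \infty$ behind the integral sign in (11), we get

$$\int_{X_n} [f_k^{(n)}(x) - f^{(n)}(x)]^2\, d\mu \le \varepsilon \tag{12}$$

(why is this justified?). Since (12) holds for every n, we can now take
the limit as $n \to \infty$, obtaining

$$\lim_{n \to \infty} \int_{X_n} [f_k^{(n)}(x) - f^{(n)}(x)]^2\, d\mu \le \varepsilon. \tag{13}$$
Now let
$$f(x) = f^{(n)}(x) \quad \text{if } x \in X_n.$$
Then (13) implies
$$\int [f_k(x) - f(x)]^2\, d\mu \le \varepsilon.$$
It follows that $f \in L_2(X, \mu)$ and $f_k \to f$ in the mean square. ∎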


Problem 1. A complex function is said to be summable if its real and
imaginary parts are summable. Show that the considerations of Sec. 37.1
carry over verbatim to the case where $L_1$ consists of all complex summable
functions (defined on X).


Problem 2. Prove that if each of the measures $\mu_1$ and $\mu_2$ has a countable
base, then so does their direct product $\mu = \mu_1 \times \mu_2$.


Comment. In particular, Lebesgue measure in the plane (or more
generally in n-space) has a countable base.

Problem 3. Let X be the interval [a, b], and let $\mu$ be ordinary Lebesgue
measure on the line. Prove that the set $\mathcal{P}$ of all polynomials on [a, b] with
rational coefficients is everywhere dense in $L_1(X, \mu)$.


Hint. Use Theorem 2 and the fact that every function continuous on
[a, b] can be approximated in the mean (or even uniformly) by elements of $\mathcal{P}$.


Problem 4. Prove that $L_2(X, \mu)$ is separable, i.e., has a countable everywhere
dense subset, if $\mu$ has a countable base.


Comment. Thus $L_2(X, \mu)$ is a Hilbert space if $\mu$ has a countable base
(we disregard the case where $L_2(X, \mu)$ is finite-dimensional). It follows from
Theorem 11, p. 155 that all such spaces are isomorphic, in particular, that
$L_2(X, \mu)$ is isomorphic to the space $l_2$ of all sequences $(x_1, x_2, \ldots, x_n, \ldots)$
such that

$$\sum_{n=1}^{\infty} x_n^2 < \infty$$

(in fact, $l_2$ corresponds to the case where the measure $\mu$ is concentrated on a
countable set of points).

Problem 5. Prove that every continuous linear functional $\varphi$ on $L_2(X, \mu)$,
where $\mu$ has a countable base, can be represented in the form

$$\varphi(f) = \int f(x) g(x)\, d\mu,$$
where g is a fixed element of $L_2(X, \mu)$.
Hint. Recall Theorem 2, p. 188.


Problem 6. Show that the considerations of Sec. 37.2 carry over verbatim
to the case where $L_2$ consists of all complex functions f satisfying the condition

$$\int |f(x)|^2\, d\mu < \infty,$$

provided the scalar product of two such functions f and g is now defined as

$$(f, g) = \int f(x)\, \overline{g(x)}\, d\mu.$$

Show that the resulting space $L_2$ is a complex Hilbert space if the measure $\mu$
has a countable base (again disregard the finite-dimensional case).


Problem 7. Let $\{f_n\}$ be a sequence of functions defined on a space X
equipped with a measure $\mu$ such that $\mu(X) < \infty$. Prove that


a) If $\{f_n\}$ converges uniformly, then $\{f_n\}$ converges in the mean and in
the mean square;

b) If $\{f_n\}$ converges in the mean or in the mean square, then $\{f_n\}$ converges
in measure (as defined in Problem 6, p. 292);

c) If $\{f_n\}$ converges in the mean or in the mean square, then $\{f_n\}$ contains
a subsequence $\{f_{n_k}\}$ which converges almost everywhere.


Hint. See Problem 9, p. 292. Alternatively, recall the proof of Theorem 1.


Problem 8. Prove that the sequence of functions constructed in Problem
8, p. 292 converges to $f(x) = 0$ in the mean and in the mean square, without
converging at a single point.

Problem 9. Give an example of a sequence of functions $\{f_n\}$ which converges
everywhere on [0, 1], but does not converge in the mean.


Hint. Let
$$f_n(x) = \begin{cases} n & \text{if } x \in (0, 1/n), \\ 0 & \text{otherwise}. \end{cases}$$


Problem 10. Give an example of a sequence of functions $\{f_n\}$ which
converges uniformly, but does not converge in the mean or in the mean
square.


Hint. According to Problem 7a, we must have $\mu(X) = \infty$. Let

$$f_n(x) = \begin{cases} \dfrac{1}{\sqrt{n}} & \text{if } |x| \le n, \\ 0 & \text{if } |x| > n. \end{cases}$$
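
Both this hint and the example preceding Theorem 5′ involve functions equal to a constant on [−n, n] and zero elsewhere, whose norms can be written down at once. The sketch below (the function name is hypothetical) tabulates the $L_1$ and $L_2$ norms for heights 1/n and 1/√n. Since both sequences converge uniformly to zero, a limit in the mean or in the mean square could only be the zero function, so it is enough to see whether these norms tend to zero.

```python
import math

def plateau_norms(height, n):
    """Return (L_1 norm, L_2 norm) of the function equal to `height` on [-n, n], 0 elsewhere."""
    l1 = 2 * n * abs(height)               # integral of |f| over [-n, n]
    l2 = math.sqrt(2 * n) * abs(height)    # square root of the integral of f^2
    return l1, l2

for n in (1, 100, 10000):
    # Height 1/n: L_1 norm stays at 2, L_2 norm tends to 0
    # Height 1/sqrt(n): L_1 norm grows without bound, L_2 norm stays at sqrt(2)
    print(n, plateau_norms(1 / n, n), plateau_norms(1 / math.sqrt(n), n))
```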


Problem 11. Show that convergence in the mean need not imply convergence
in the mean square, whether or not $\mu(X) < \infty$.


Problem 12. Let $L_p(X, \mu)$ be the set of all classes of equivalent (real or
complex) functions f such that

$$\int |f(x)|^p\, d\mu < \infty \qquad (1 \le p < \infty),$$

equipped with the norm
$$\|f\| = \left( \int |f(x)|^p\, d\mu \right)^{1/p}.$$

Prove that $L_p(X, \mu)$ is a Banach space.
Absolutely continuous charge, 347
Absolutely continuous function, 336 
Absolutely summable sequence, 185 
Adjoint operator, 232 
in Hilbert space, 234 
Aleph null, 16 
Alexandroff, P. S., 90, 97 
Algebra of sets, 31 
Algebraic dimension, 128 
Algebraic number, 19 
Almost everywhere, 288 
Angle between vectors, 143 
Arzela’s theorem, 102 
generalization of, 107 
Axiom of choice, 27 
Axiom of countability: 
first, 93 
second, 82 
Axiom of separation: 
first, 85 
Hausdorff, 85 
second, 85 


Baire’s theorem, 61 
B-algebra (see Borel algebra) 
Banach, S., 138, 229, 238 
Banach space, 140 
Base, 81 

countable, 382 

neighborhood (local), 83 
Basis, 121 

dual, 185 
Hamel, 128 
orthogonal, 143 
orthonormal, 143 
Bessel’s inequality, 150, 165 
Bicompactum, 96 
Binary relation (see Relation) 
Birkhoff, G., 28 
Bolzano-Weierstrass theorem, 101 
Borel algebra, 35 
irreducible, 36 
minimal, 36 
Borel closure, 36 
Borel sets, 36 
Bounded linear functional, 177 
norm of, 177 
Bounded real function, 110 
Bounded set, 65, 141, 169 
locally, 169 
strongly, 197 
weakly, 197 
B-set (see Borel set) 


C


Cantor, G., 29 
Cantor function, 335 
Cantor set, 52 
points of the first kind of, 53 
points of the second kind of, 53 
Cantor-Bernstein theorem, 17 
Cardinal number, 24 
Cartesian product (see Direct product) 
Cauchy criterion, 56 
Cauchy sequence, 56 
Cauchy-Schwarz inequality, 38 
Chain, 28

maximal, 28 
Characteristic function, 349 
Charge, 344 

absolutely continuous, 347 

concentrated, on a set, 346 

continuous, 346 

density of, 350 

discrete, 347 

negative, 344 

negative variation of, 346 

positive, 344 

positive variation of, 346 

Radon-Nikodym derivative of, 350 

singular, 347 

total variation of, 346 
Chebyshev’s inequality, 299 
Choice function, 27 
Classes, 6 

equivalence, 8 
Closed ball (see Closed sphere) 
Closed graph theorem, 238 
Closed set(s), 49 

in a topological space, 79 

on the real line, 51 

unions and intersections of, 49 
Closed sphere(s), 46 

center of, 46 

nested (or decreasing) sequence of, 


radius of, 46 
Closure, 46, 79 
Closure operator, 46 
properties of, 46 
Codimension, 122 
Cohen, P. J., 29 
Compact space, 92 
countably, 95 
locally, 97 
Compactness, 92 
countable, 95 
relative, 97 
relative countable, 97 
Compactum, 92, 96 
metric, 96 
Complement of a set, 3 
Complete limit point, 97 
Complete measure, 280 
Completely continuous operator(s), 239 ff. 
basic properties of, 243-246 
in Hilbert space, 246-251 
Completely regular space, 92 


Completion (of a metric space), 62 
Component (of an open set), 55 
Conjugate space, 185 
of a normed linear space, 184 
second, 190 
strong topology in, 190 
third, 190 
weak topology in, 200 
weak* topology in, 202
Connected set, 55 
Connected space, 84 
Contact point, 46, 79 
Continuity, 44, 87 
from the left, 315 
from the right, 315 
uniform, 109 
Continuous charge, 346 
Continuous linear functional(s), 175 ff. 
order of, 182 
sufficiently many, 181 
Continuum, 16 
power of, 16 
Contraction mapping(s), 66 ff. 
and differential equations, 71-72 
and integral equations, 74-76 
and systems of differential equations, 
72-74
principle of, 66 
Convergence almost everywhere, 289 
Convergence in measure, 292 
Convergence in the mean, 379 
Convergence in the mean square, 385 
Convergent sequence: 
in a metric space, 47 
in a topological space, 84 
Convex body, 129 
Convex functional, 130, 134 
Convex hull, 130 
Convex set, 129 
Convexity, 128 
Countability of rational numbers, 11 
Countable additivity, 266, 272 
Countable base, 382 
Countable set, 10 
Countably compact space, 95 
Countably Hilbert space, 173 
Countably normed (linear) space, 171 
complete, 173 
Cover, 83 
closed, 83 
open, 83 
Covering (see Cover) 


Curve(s): 
in a metric space, 112-113 
length of, 114, 115 
sequence of, 115 
rectifiable, 332 


D 
Decomposition of a set into classes, 6-9 
δ-algebra, 35
δ-ring, 35


Delta function, 124, 208 
Dense set, 48 
everywhere, 48 
nowhere, 48, 61 
Density, 350 
Derived numbers, 318 
left-hand lower, 318 
right-hand upper, 318 
Diameter of a set, 65 
Difference between sets, 3 
Differentiation : 
of a monotonic function, 318-323 
of an integral with respect to its upper 
limit, 323-326 
Dimension, 121 
algebraic, 128 
Dini’s theorem, 115 
Direct product, 238, 352 
of measures, 354 
Directed set, 29 
Dirichlet function, 289, 291, 301 
Discontinuity point of the first kind, 315 
Discrete charge, 347 
Discrete space, 38 
Disjoint sets, 2 
pairwise, 2 
Distance: 
between a point and a set, 54 
between two sets, 55 
properties of, 37 
symmetry of, 37 
Domain (of definition), 4, 5, 221 
Domain (open connected set), 71 


E 


Egorov’s theorem, 290 
Eigenvalue, 235 
Eigenvector, 235 
Elementary set, 255 
measure of, 256 
Empty set, 2
ε-neighborhood, 46
ε-net, 98
Equicontinuous family of functions, 102 
Equivalence classes, 8 
Equivalence relation, 7 
Equivalent functions, 288 
Equivalent sets, 13 
Essential supremum, 311 
Essentially bounded function, 310 
Euclidean n-space, 38, 144 
Euclidean space(s), 138, 142 ff. 
characterization of, 160 
complete, 153 
norm of vector in, 164 
orthogonal elements of, 164 
components of elements of, 149 
norm in, 142 
separable, 146 
Euler lines, 105 
Exhaustive sequence of sets, 308 
Extension of a functional, 132 
Extension of a measure, 271, 277, 279 
Jordan, 281 


F 


Factor space, 122 
Fatou’s theorem, 307 
Field, 37 
Finite expansion, 33 
Finite function, 208 
Finite set, 10 
First axiom of countability, 83 
First axiom of separation, 85 
Fixed point, 66 
Fixed point theorem, 66 
Fourier coefficients, 149, 152, 165 
Fourier series, 149, 165 
Fractional part, 8 
Fraenkel, A. A., 25, 27 
Fredholm equation, 74 
homogeneous, 74 
kernel of, 74 
nonhomogeneous, 74 
Friedman, A., 212 
Fubini’s theorem, 359 
Function space, 39, 108 
Functional(s), 108, 123 
addition of, 183 
additive, 123 
bounded linear (see Bounded linear 
functional) 


conjugate-homogeneous, 123 
conjugate-linear, 124 
continuous, 175 
continuous linear (see Continuous linear 

functionals) 
convex, 130, 134 
extension of, 132 
homogeneous, 123 
linear, 124, 175 ff. 
Minkowski, 131 
null space of, 125 
product of, with a number, 183 
separation of sets by, 136 

Function(s), 4 ff. 
absolutely continuous, 336 
Borel-measurable, 284 
bounded (real), 110, 207 
Cantor, 335 
characteristic, 349 
continuous, 44, 79 

from the left, 315 
from the right, 315 
uniformly, 109 
delta, 124, 208 
domain (of definition of), 4, 5 
equivalent, 288 
essentially bounded, 310 
finite, 207 
general, 5 
generalized (see Generalized functions) 
generating, 362 
infinitely differentiable, 169 
integrable, 294, 296, 308 
locally, 208 
inverse, 5 
jump, 315, 341 
jump of, 315 
left-hand limit of, 315 
lower limit of, 111 
lower semicontinuous, 110 
measurable, 284 ff. 
monotonic, 314 
nondecreasing, 314 
nonincreasing, 314 
of bounded variation, 328-332 
one-to-one, 5 
oscillation of, 111 
range of, 4, 5 
real, 4 
right-hand limit of, 315 
simple, 286 



singular, 341 

step, 316 

summable, 294, 296, 308 

test, 208 

uniformly continuous, 109 

upper limit of, 111 

upper semicontinuous, 110 
Fundamental functions (see Test functions) 
Fundamental parallelepiped, 98 
Fundamental sequence (see Cauchy se- 

quence) 

Fundamental space (see Test space) 


G 


General measure theory, 269 ff. 
Generalized function(s), 124, 206 ff. 
and differential equations, 211-214 
complex, 215 
convergence of, 209 
definition of, 208 
derivative of, 210 
of several variables, 214-215 
on the circle, 216 
operations on, 209-210 
product of, with a number, 209 
product of, with an infinitely differenti- 
able function, 210 
regular, 208 
singular, 208 
sum of, 209 
Gödel, K., 209
Graph, 238 
Greatest lower bound (in a partially ordered 
set), 30 
Gurevich, B. L., 350, 351 


H 


Hahn decomposition, 345 

Hahn-Banach theorem, 132, 180 
complex version of, 134, 181 

Hamel basis, 128 

Hausdorff axiom of separation, 85 

Hausdorff space, 85 

Hausdorff’s maximal principle, 28 

Heine-Borel theorem, 92 

Helly’s convergence theorem, 370 

Helly’s selection principle, 372 

Hereditary property, 87 

Hilbert, D., 155 


Hilbert cube, 98 
Hilbert space(s), 155 ff. 
complex, 165 
countably, 173 
isomorphic, 155, 165 
linear manifold in, 156 
closed, 156 
subspace(s) of, 156 
direct sum of orthogonal, 159 
(mutually) orthogonal, 158 
orthogonal complement of, 157 
Hilbert-Schmidt theorem, 248 
Hölder’s inequality, 41
homogeneity of, 42
Hölder’s integral inequality, 45
Homeomorphic mapping, 44, 89 
Homeomorphic spaces, 44, 89 
Homeomorphism, 44, 89 
Hyperplane, 127 


I


Ideal, two-sided, 252
Image: 
of an element, 5 
of a set, 5 
Infimum, 51 
Infinite set, 10 
Initial section, 25 
Inner measure, 258, 276 
Integrable function, 294, 296, 308 
Integral part, 8 
Interior, 128 
Interior point, 50 
Intersection of sets, 2 
Into mapping, 5 
Invariant subspace, 238 
Inverse function, 5 
Invisible point: 
from the left, 319 
from the right, 319 
Isolated point, 47 
Isometry, 44 
Isomorphism, 21, 120, 155, 165 
conjugate-linear, 194, 234 
Isomorphism theorem, 155, 165 


J 


Jordan decomposition, 346 
Jordan extension, 281 
Jordan measurable set, 281
Jordan measure, 281 
Jump, 315 

Jump function, 315, 341 


K 


Kelley, J. L., 87, 90, 92, 97 
Kernel, 74 


L 


Lattice, 30 
Least upper bound (in a partially ordered 
set), 30 
Lebesgue decomposition, 341, 351, 363 
Lebesgue extension, 277, 279 
Lebesgue integral, 293 ff. 
absolute continuity of, 300-301 
as a set function, 343-351 
indefinite, 313 ff. 
of a general measurable function, 296, 
308 
of a simple function, 294 
over a set of infinite measure, 308 
vs. Riemann integral, 293-294, 309-310 
Lebesgue-integrable function (see Integrable function)
Lebesgue-Stieltjes integral, 364 
vs. Riemann-Stieltjes integral, 368 
Lebesgue’s bounded convergence theorem, 
303 
Lebesgue’s theorem: 
on differentiation of a monotonic function, 321
on integration of the derivative of an 
absolutely continuous function, 340 
Left-hand limit, 315 
Levi’s theorem, 305 
Limit of a sequence: 
in a metric space, 47 
in a topological space, 84 
Limit point, 47, 79 
complete, 97 
Linear closure, 140 
Linear combination, 120 
Linear dependence, 120 
Linear functional, 175 ff. 
bounded (see Bounded linear functional)
continuous (see Continuous linear functionals)
Linear hull, 122
Linear independence, 121 
Linear manifold, 140, 156 
Linear operator, 221 
bounded, 223 
norm of, 224 
spectral radius of, 239 
closed, 237 
completely continuous (see Completely 
continuous operators) 
graph of, 238 
Linear space(s), 118 ff. 
basis in, 121 
Hamel, 128 
closed segment in, 128 
complex, 119 
countably normed, 171 
dimension of, 121 
algebraic, 128 
finite-dimensional, 121 
functionals on (see Functionals) 
infinite-dimensional, 121 
isomorphic, 120 
linearly dependent elements of, 120 
linearly independent elements of, 121 
n-dimensional, 121 
normed (see Normed linear spaces) 
open segment in, 128 
real, 119 
subspace, 121 
proper, 121 
topological (see Topological linear space) 
Linearly ordered set (see Ordered set) 
Lipschitz condition, 55 
Locally integrable function, 208 
Lower limit, 111 
Lower semicontinuous function, 110 
Luzin’s theorem, 293 


M 


Mapping, 5 ff. 

continuous, 44, 87 

contraction, 66 

fixed point of, 66 

into, 5 

natural, 191 

one-to-one, 5 

onto, 5 

order-preserving, 21 
Mathematical expectation, 366 
Mathematical induction, 28 


Mean square deviation, 384 
Mean (value), 366 
Measurable function, 284 ff. 
integration of, 294, 296, 308 
Measurable set(s), 259 ff., 267
decreasing sequence of, 266 
increasing sequence of, 267 
Jordan, 281 
Measure(s), 254 ff. 
additivity of, 255, 263 
complete, 280 
continuity of, 267 
countably (σ-) additive, 266, 272
direct product of, 354 
extension(s) of, 271, 275-283 
inner, 258, 276 
Jordan, 281 
Lebesgue, 259, 276, 279 
of an elementary set, 256 
of a plane set, 259, 276 
of a rectangle, 255 
on a semiring, 270 
outer, 258, 276 
product, 354 
σ-finite, 308
signed, 344 
Stieltjes (see Stieltjes measure) 
with a countable base, 382 
Measure space, 294 
Method of successive approximations, 66, 
67 
Metric (see Distance) 
Metric space(s), 37 ff. 
complete, 56 
completion of, 62 
continuous curves in, 112-113 
length of, 114, 115 
sequence of, 115 
continuous mapping of, 44 
convergence in, 47 
incomplete, 56 
isometric, 44 
isometric mapping of, 44 
real functions on, 108 
equivalent continuous, 113 
uniformly continuous, 109 
relatively compact subsets of, 101 
separable, 48 
subspace of, 43 
total boundedness of, 97-99 
compactness and, 99-101 
Metrizable space, 90 


Minkowski functional, 131 
Minkowski’s inequality, 41 
Minkowski’s integral inequality, 45 
Monotonic function, 314 


N 


n-dimensional simplex, 137 
k-dimensional face of, 137 
vertices of, 137 

n-dimensional (vector) space, 119 

Negative set, 344 

Neighborhood, 46, 79 

Neighborhood base, 83 
at zero, 168 

Nested sphere theorem, 60 

Noncomparable elements, 21 

Nondecreasing function, 314 

Nonincreasing function, 314 

Nonmeasurable set, 268 

Normal space, 86 

Normed linear space(s), 138 
bounded subset of, 141 
complete, 140 
complete set in, 140 
conjugate space of, 184 
direct product of, 238 
subspaces of, 140 

Norm(s), 138, 142, 163 
compatible, 171 
comparable, 172 
equivalent, 141, 172 
of a bounded linear functional, 177 
of a bounded linear operator, 224 
properties of, 138 
stronger, 172 
weaker, 172 

n-space, 119 

Null space, 125 


O


One-to-one correspondence, 5, 10, 13 
One-to-one function, 5 
Onto mapping, 5 
Open ball (see Open sphere) 
Open set(s), 50 
component of, 55 
in a topological space, 78 
on the real line, 51 
unions and intersections of, 50 
Open sphere, 45
center of, 46 
radius of, 46 
Operator(s), 221 ff. 
adjoint, 232 
in Hilbert space, 234 
continuous, 221 
degenerate, 240 
domain (of definition) of, 221 
eigenvalue of, 235 
eigenvector of, 235 
identity (or unit), 222 
inverse, 228 
invertible, 228 
linear (see Linear operator) 
product of, 225 
with a number, 225 
projection, 223 
resolvent of, 236 
self-adjoint, 235 
spectrum of, 235 
sum of, 225 
zero, 222 
Order type (see Type) 
Ordered product, 23 
Ordered set, 21 
Ordered sum, 22 
Order-preserving mapping, 21 
Ordinal, 24 
transfinite, 24 
Ordinal number(s), 24 
comparison of, 25 
Orthogonal basis, 143 
Orthogonal complement, 157 
Orthogonal system, 143
complete, 143
Orthogonal vectors, 143
Orthogonalization, 148
Orthogonalization theorem, 147
Orthonormal basis, 143
Orthonormal system, 143 
closed, 151 
complete, 143 
vs. closed, 151 
Oscillation, 111 
Outer measure, 258, 276


P 


Parseval’s theorem, 151 
Partial ordering, 20 
Partially ordered set(s), 20 
isomorphic, 21 
maximal element of, 21
minimal element of, 21 
noncomparable elements of, 21 
Partition of a set into classes, 6-9 
Peano’s theorem, 104 
Petrovski, I. G., 76 
Picard’s theorem, 71 
Polygonal line, 55 
Positive set, 344 
Power: 
of a set, 16 
of the continuum, 16 
Preimage: 
of a set, 5 
of an element, 5 
Principle of contraction mapping, 66 
Probability density, 367 
Product measure, 354, 356 
evaluation of, 356-359 
Projection operator, 223 
Proper subspace, 121 


Q 


Quotient space (see Factor space)


R


Radon-Nikodym derivative, 350 
Radon-Nikodym theorem, 347 
Random variable, 366 
continuous, 366 
discrete, 366 
mathematical expectation of, 366 
mean (value) of, 366 
probability density of, 367 
variance of, 366 
Range, 4, 5
Rectangle, 255 
closed, 255 
half-open, 255 
measure of, 255 
open, 255 
Rectifiable curve, 332 
Reflexive space, 191 
Reflexivity, 7 
Relation, 7 
antisymmetric, 7 
binary, 7 
equivalence, 7 


reflexive, 7 
symmetric, 7 
transitive, 7 
Relatively compact subset, 97 
Relatively countably compact subset, 97 
Residue class, 122 
Resolvent, 236 
Riemann integral, 293 
vs. Lebesgue integral, 293-294, 309-310 
Riemann-Stieltjes integral, 367 
vs. Lebesgue-Stieltjes integral, 368 
Riesz lemma, 319 
Riesz representation theorem, 374 
Riesz-Fischer theorem, 153 
Right-hand limit, 315 
Ring of sets, 31 
minimal, generated by a semiring, 34 
minimal, generated by a system of sets, 32 
Rozanov, Y. A., 366 


S


Scalar product, 142 

complex, 163 
Schwartz, L., 212 
Schwarz’s inequality, 40, 142 
Second axiom of countability, 82 
Second axiom of separation, 85 
Self-adjoint operator, 235 
Semireflexive space, 191 
Semiring of sets, 32 

finite expansion in, 33 

minimal ring generated by, 34 
Separable (metric) space, 48 
Set of σ-uniqueness, 282
Set of uniqueness, 282 
Set theory, 1-36 

naive vs. axiomatic, 29 
Set(s), 1 ff. 

algebra of, 31 

bounded, 65, 141 

totally, 98 

Cantor, 52 

closed, 49 

closure of, 46 

complement of, 3 

connected, 55 

contact point of, 46 

convex, 129 

countable, 10 

curly bracket notation for, 1 


decomposition of, 6 
dense, 48 
everywhere, 48 
nowhere, 48, 61 
diameter of, 65 
difference between, 3 
direct product of, 352 
directed, 29 
disjoint, 2 
pairwise, 2 
duality principle for, 4 
elementary, 255 
elements of, 1 
empty, 2 
equivalent, 13 
exhaustive sequence of, 308 
finite, 10 
infinite, 10 
interior of, 128 
interior point of, 50 
intersection of, 2 
isolated point of, 47 
Jordan measurable, 281 
(Lebesgue) measurable, 259, 267, 276, 
279 
limit point of, 47 
complete, 97 
measure of, 259, 267, 276, 279 
negative, 344 
nonmeasurable, 268 
of uniqueness, 282 
of σ-uniqueness, 282
open, 50 
operations on, 2 ff. 
ordered, 21 
partially ordered, 20 
partition of, 6 
positive, 344 
power of, 16 
ring of, 31 
semiring of, 32 
subset of, 1 
proper, 2 
sum of, 2 
symmetric, 171 
symmetric difference of, 3, 4
systems of, 31-36 
totally bounded, 98 
uncountable, 10 
union of, 2 
well-ordered, 23 
Shilov, G. E., 147, 155, 245, 350, 351
σ-additivity (see Countable additivity)
σ-algebra, 35

σ-finite measure, 308

σ-ring, 35

Signed measure, 344 


Silverman, R. A., 76, 140, 147, 247, 350, 366

Simple function, 286 
Simplex (see n-dimensional simplex) 
Simply ordered set (see Ordered set) 
Singular charge, 347 
Singular function, 341 
Smirnov, V. I., 247 
Space: 

c, 120
c_0, 120
C_[a,b], 39, 57
C^2_[a,b], 40, 59
C^n, 119
C(I, R), 113
of isolated points, 38, 56
of rapidly decreasing sequences, 172
l_2, 39, 57
l_p, 43
L_1, 378
L_2, 383
m, 41, 120
R^1, 38, 56
R^n, 38, 57
R^∞, 120
R_p^n, 41
Spectral radius, 239 
Spectrum, 235 

continuous, 236 

point, 236 

regular point of, 235 
Step function, 211, 316 
Stereographic projection, 14 


Stieltjes integral (see Lebesgue-Stieltjes integral)

Stieltjes measure, 362, 364 
absolutely continuous, 363 
discrete, 363 
generating function of, 362 
singular, 363 

Strong convergence, 195 

Strong topology, 184 
in conjugate space, 190 

Subcover, 83 

Subset, 1 
proper, 2 
Subspace, 121
closed, 140 
generated by a set, 122 
invariant, 238 
proper, 121 
Successive approximations, method of, 66, 67
Sum of sets, 2 
Summable function, 294, 296, 308 
complex, 387 
Supremum, 41, 51 
Symmetric difference, 3, 4 
Symmetric set, 171 
Symmetry, 7 
System of sets, 31 
centered, 92 
trace of, 80 
unit of, 31 


T 


Test functions, 208 
convergence of, 208 
Test space, 208, 216 
Tolstov, G. P., 140, 145 
Topological linear space, 138, 167 ff. 
bounded subset of, 169 
continuous mapping of, 87 
functionals on, 175 
continuous, 175 
continuous linear, 175 ff. 
linear, 175 
locally bounded, 169 
locally convex, 169 
neighborhood base at zero of, 168 
normable, 169 
weak topology in, 195 
Topological space(s), 78 ff. 
base for, 81 
bicompact, 96 
closed sets of, 79 
compact, 92 
completely regular, 92 
connected, 84 
convergence in, 84 
countably compact, 95 
cover (covering) of, 83 
hereditary property of, 87 
locally compact, 97 
metrizable, 90 
normal, 86 
open sets of, 78 


points of, 79 
real functions on, 108 
relatively compact subset of, 97 
relatively countably compact subset of, 97
with a countable base, 82 
Topology, 78 
generated by a system of sets, 80 
relative, 80 
strong, 184, 190 
stronger, 80 
weak, 195, 200 
weak*, 202 
weaker, 80 
Total variation, 328, 346 
Totally bounded set, 98 
Transcendental number, 19 
Transfinite induction, 29 
Transfinite ordinal, 24 
Transitivity, 7 
Triangle inequality, 37, 138 
T_1-space, 85
T_2-space, 85
Two-sided ideal, 252 
Tychonoff space, 92 
Type(s), 22 
ordered sum of, 23 
ordered product of, 23 
vs. power, 22 


U 


Uncountability of real numbers, 15 
Uncountable set, 10 

Uniform continuity, 109 

Uniformly bounded family of functions, 102 
Union of sets, 2 

Unit (of a system of sets), 31 

Upper bound (in a partially ordered set), 28 
Upper limit, 111 

Upper semicontinuous function, 110 
Urysohn’s lemma, 91 

Urysohn’s metrization theorem, 90 


V


van der Waerden, B. L., 327 
Variance, 366 
Variation: 

bounded, 328 

negative, 346 
Vector space (see Linear space)
Volterra equation, 75 
Volterra operator, 243


W


Weak convergence, 195
of functionals, 200 
Weak* convergence, 202 
Weak topology, 195 
Weak* topology, 202
Weierstrass’ approximation theorem, 140, 